'\" te
.\" Copyright (c) 2014, 2023, Oracle and/or its affiliates.
.TH iconv_extra 7 "12 Sep 2023" "Oracle Solaris 11.4" "Standards, Environments, Macros, Character Sets, and miscellany"
.SH NAME
iconv_extra \- codeset conversion for non-Unicode encodings
.SH DESCRIPTION
.sp
.LP
\fBiconv\fR and \fBcconv\fR support conversions to and from a wide range of codesets.
.sp
.LP
The lists below provide basic information about encodings, mainly for the EMEA regions. For information on Asian encodings, refer to the \fBiconv_ja\fR(7), \fBiconv_ko\fR(7), \fBiconv_zh\fR(7), \fBiconv_zh_HK\fR(7), and \fBiconv_zh_TW\fR(7) manual pages. For information on Unicode encodings, refer to the \fBiconv_unicode\fR(7) manual page.
.sp
.LP
The codeset names are shown in their canonical form, directly usable as \fIfromcode\fR or \fItocode\fR parameters to \fBiconv\fR(1), \fBiconv_open\fR(3C), and \fBcconv_open\fR(3C). Aliases are given in parentheses where applicable.
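.sp
.LP
The roles of \fIfromcode\fR and \fItocode\fR can be illustrated with a minimal sketch. It uses Python's built-in codecs as a stand-in and does not call the Oracle Solaris \fBiconv\fR modules. The helper name \fBconvert\fR is hypothetical, and the sketch uses only names that Python also happens to accept (\fBISO8859-1\fR, \fB646\fR, and \fBUTF-8\fR).
.sp
.in +2
.nf
```python
def convert(data, fromcode, tocode):
    # Decode the input bytes from fromcode, then encode them as tocode.
    # Either step raises an exception for bytes or characters that the
    # respective codeset cannot represent.
    return data.decode(fromcode).encode(tocode)


# "cafe" with an acute accent on the final letter, as ISO8859-1 bytes.
latin1_text = bytes([0x63, 0x61, 0x66, 0xE9])

utf8_text = convert(latin1_text, "ISO8859-1", "UTF-8")
print(utf8_text.hex())  # 636166c3a9
```
.fi
.in -2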
.sp
.LP
The \fBiconv\fR and \fBcconv\fR conversions available on the current system can be listed by running \fBiconv -l\fR, as described in the \fBiconv\fR(1) manual page.
.sp
.LP
For additional information on the mappings between canonical names and supported aliases with optional variant levels, refer to the \fBalias\fR(5) manual page and also the \fB/usr/lib/iconv/alias\fR file.
.SS "646 - ISO/IEC 646:1991 and Variants"
.sp
.LP
The codeset of the "\fBC\fR" locale in Oracle Solaris, the ISO basic Latin alphabet, is referred to by the canonical name \fB646\fR. Common aliases such as \fBUS-ASCII\fR and \fBASCII\fR are also defined.
.sp
.LP
The following national variants of the 646 codeset are also available:
.sp
.TS
tab(@) box;
lw(2.36i) |lw(3.14i)
lw(2.36i) |lw(3.14i)
.
646 Codeset@National Variant
_
\fB646de\fR@Germany
_
\fB646ch\fR@Switzerland
_
\fB646gb (646en)\fR@United Kingdom
_
\fB646fr\fR@France
_
\fB646ca\fR@Canada
_
\fB646fi\fR@Finland
_
\fB646sv\fR@Sweden
_
\fB646it\fR@Italy
_
\fB646dk (646da)\fR@Denmark
_
\fB646es (646sp)\fR@Spain
_
\fB646pt\fR@Portugal
.TE
.sp
.SS "ISO 8859 Character Sets"
.sp
.TS
tab(@) box;
lw(1.27i) |lw(4.23i)
lw(1.27i) |lw(4.23i)
.
ISO 8859 Character Set@Description
_
ISO8859-1 (Latin1)@T{
For most West European languages, including:
.sp
.in +2
.nf
Albanian   Finnish    Italian
Catalan    French     Norwegian
Danish     German     Portuguese
Dutch      Galician   Spanish
English    Irish      Swedish
Faeroese   Icelandic
.fi
.in -2
.sp
T}
_
ISO8859-2 (Latin2)@T{
For most Latin-written Slavic and Central European languages:
.sp
.in +2
.nf
Czech      Polish     Slovak
German     Romanian   Slovene
Hungarian  Croatian
.fi
.in -2
.sp
T}
_
ISO8859-3 (Latin3)@T{
Used for Esperanto, Galician, Maltese, and Turkish.
T}
_
ISO8859-4 (Latin4)@T{
Introduces letters for Estonian, Latvian, and Lithuanian. It is an incomplete predecessor of ISO 8859-10.
T}
_
ISO8859-5@T{
For languages that use the Cyrillic alphabet, such as Belarusian, Bulgarian, Macedonian, Russian, Serbian, and Ukrainian.
T}
_
ISO8859-6@Latin + Arabic.
_
ISO8859-7@T{
Latin + Greek. Does not include accents used in polytonic Greek.
T}
_
ISO8859-8@Latin + Hebrew.
_
ISO8859-9 (Latin5)@T{
Replaces the rarely needed Icelandic letters in ISO 8859-1 (Latin 1) with the Turkish ones.
T}
_
ISO8859-10 (Latin6)@T{
Adds the last Inuit (Greenlandic) and Sami (Lappish) letters that were not included in ISO 8859-4 (Latin 4) to complete coverage of the Nordic area.
T}
_
ISO8859-11@T{
Latin + Thai. ISO/IEC 8859-11:2001 is equivalent to TIS 620-2533 (1990) with the addition of \fB0xA0\fR NO-BREAK SPACE.
T}
_
ISO8859-13 (Latin7)@T{
Includes characters for Baltic languages that were missing from Latin-4 and Latin-6.
T}
_
ISO8859-14 (Latin8)@T{
Covers Celtic languages such as Gaelic and Breton.
T}
_
ISO8859-15 (Latin9)@T{
Variant of ISO 8859-1 that replaces 8 less-used characters and introduces the euro sign.
T}
_
ISO8859-16 (Latin10)@T{
Supports Albanian, Croatian, English, Finnish, French, German, Hungarian, Irish Gaelic (new orthography), Italian, Latin, Polish, Romanian, and Slovenian. The currency sign is replaced with the euro sign.
T}
.TE
.sp
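.LP
The relationship between ISO8859-1 and ISO8859-15 described in the table above can be checked with a small sketch. It relies on Python's built-in codecs rather than on the Oracle Solaris \fBiconv\fR modules, and the helper name \fBdiffering_bytes\fR is hypothetical.
.sp
.in +2
.nf
```python
def differing_bytes(codeset_a, codeset_b):
    # Return (byte value, character in a, character in b) for every
    # single-byte value that the two codesets decode differently.
    result = []
    for value in range(0x100):
        char_a = bytes([value]).decode(codeset_a)
        char_b = bytes([value]).decode(codeset_b)
        if char_a != char_b:
            result.append((value, char_a, char_b))
    return result


diffs = differing_bytes("ISO8859-1", "ISO8859-15")
print(len(diffs))  # 8
for value, char_a, char_b in diffs:
    print(hex(value), "U+%04X" % ord(char_a), "U+%04X" % ord(char_b))
# The first line printed by the loop is: 0xa4 U+00A4 U+20AC
```
.fi
.in -2
.sp
.LP
Byte 0xA4 is the currency sign (U+00A4) in ISO8859-1 and the euro sign (U+20AC) in ISO8859-15.
.sp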
.SS "IBM EBCDIC Code Pages"
.sp
.LP
\fBEBCDIC\fR (Extended Binary Coded Decimal Interchange Code) is an 8-bit character encoding used mainly on IBM mainframes. The following table summarizes the supported IBM EBCDIC-based code pages.
.sp
.LP
Codeset names for IBM PC and EBCDIC code pages are prefixed with "\fBIBM-\fR", as in \fBIBM-037\fR.
.sp
.TS
tab(@) box;
lw(1.03i) |lw(4.47i)
lw(1.03i) |lw(4.47i)
.
EBCDIC Code Page@Country/Region
_
\fBIBM-037\fR@Latin-1 character set
_
\fBIBM-273\fR@Austria, Germany
_
\fBIBM-277\fR@Denmark, Norway
_
\fBIBM-278\fR@Finland, Sweden
_
\fBIBM-280\fR@Italy
_
\fBIBM-284\fR@Latin America, Spain
_
\fBIBM-285\fR@Ireland, United Kingdom
_
\fBIBM-297\fR@France
_
\fBIBM-420\fR@Egypt, Iraq, Jordan, Saudi Arabia, Syria
_
\fBIBM-424\fR@Israel
_
\fBIBM-500\fR@T{
Australia, Austria, Belgium, Brazil, Canada, Denmark, Finland, France, Germany, Iceland, Ireland, Italy, Japan, Latin America, Multinational, Netherlands, New Zealand, Norway, Portugal, South Africa, Spain, Sweden, Switzerland, United Kingdom, and United States
T}
_
\fBIBM-838\fR@Thailand
_
\fBIBM-875\fR@Greece
_
\fBIBM-933\fR@Korea
_
\fBIBM-935\fR@Simplified Chinese
_
\fBIBM-937\fR@Traditional Chinese
_
\fBIBM-1025\fR@T{
Belarus, Bosnia-Herzegovina, Bulgaria, Macedonia (FYR), Montenegro, Russia, Serbia, Serbia-Montenegro, and Yugoslavia
T}
_
\fBIBM-1026\fR@Multinational, Turkey
_
\fBIBM-1112\fR@Estonia, Latvia, Lithuania
_
\fBIBM-1122\fR@Estonia
_
\fBIBM-1140\fR@T{
Australia, Brazil, Canada, Multinational, Netherlands, New Zealand, Portugal, South Africa, Taiwan, and United States
T}
_
\fBIBM-1141\fR@Austria, Germany
_
\fBIBM-1142\fR@Denmark, Norway
_
\fBIBM-1143\fR@Finland, Sweden
_
\fBIBM-1144\fR@Italy
_
\fBIBM-1145\fR@Latin America, Spain
_
\fBIBM-1146\fR@Ireland, United Kingdom
_
\fBIBM-1147\fR@France
_
\fBIBM-1148\fR@T{
Australia, Austria, Belgium, Brazil, Canada, Denmark, Finland, France, Germany, Iceland, Ireland, Italy, Japan, Latin America, Multinational, Netherlands, New Zealand, Norway, Portugal, South Africa, Spain, Sweden, Switzerland, United Kingdom, and United States
T}
_
\fBIBM-1149\fR@Iceland
.TE
.sp
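.LP
To show how EBCDIC byte values differ from those of the 646 codeset, the following sketch encodes the same text both ways. It uses Python's built-in codecs as a stand-in for the Oracle Solaris conversions. The \fBPYTHON_CODEC\fR mapping and the helper name \fBto_ebcdic\fR are hypothetical; the mapping is needed because Python names these code pages without the "\fBIBM-\fR" prefix.
.sp
.in +2
.nf
```python
# Hypothetical mapping from codeset names in the table above to the
# names of Python's built-in codecs (an assumption of this sketch).
PYTHON_CODEC = {
    "IBM-037": "cp037",
    "IBM-500": "cp500",
    "IBM-1140": "cp1140",
}


def to_ebcdic(text, codeset):
    # Encode text into the given EBCDIC code page.
    return text.encode(PYTHON_CODEC[codeset])


print(to_ebcdic("HELLO", "IBM-037").hex())  # c8c5d3d3d6
print("HELLO".encode("646").hex())          # 48454c4c4f
```
.fi
.in -2
.sp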
.SS "IBM-PC Code Pages"
.sp
.LP
The following table covers the supported IBM-PC (DOS and Windows) code pages.
.sp
.TS
tab(@) box;
lw(0.97i) |lw(4.53i)
lw(0.97i) |lw(4.53i)
.
IBM-PC Code Page@Country/Region
_
\fBIBM-850\fR@T{
Albania, Australia, Austria, Belgium, Bosnia-Herzegovina, Brazil, Bulgaria, Canada, Croatia, Czech Republic, Denmark, Egypt, Finland, France, Germany, Greece, Hungary, Iceland, Iraq, Ireland, Italy, Jordan, Latin America, Multinational, Netherlands, New Zealand, Norway, Poland, Portugal, Romania, Russia, Saudi Arabia, Slovakia, Slovenia, South Africa, Spain, Sweden, Switzerland, Syria, United Kingdom, and United States
T}
_
\fBIBM-852\fR@T{
Albania, Bosnia-Herzegovina, Croatia, Czech Republic, Hungary, Multinational, Poland, Romania, Slovakia, and Slovenia
T}
_
\fBIBM-855\fR@T{
Bosnia-Herzegovina, Bulgaria, Macedonia (FYR), Montenegro, Multinational, Serbia, Serbia-Montenegro, and Yugoslavia
T}
_
\fBIBM-856\fR@Israel
_
\fBIBM-857\fR@Multinational, Turkey
_
\fBIBM-862\fR@Israel
_
\fBIBM-864\fR@T{
Egypt, Iraq, Jordan, Saudi Arabia, and Syria
T}
_
\fBIBM-866\fR@Russia
_
\fBIBM-869\fR@Greece
_
\fBIBM-870\fR@T{
Albania, Bosnia-Herzegovina, Croatia, Czech Republic, Hungary, Multinational, Poland, Romania, Slovakia, and Slovenia
T}
_
\fBIBM-871\fR@Iceland
_
\fBIBM-874\fR@Thailand
_
\fBIBM-921\fR@Estonia, Latvia, Lithuania
_
\fBIBM-922\fR@Estonia
.TE
.sp
.SS "Microsoft Code Pages"
.sp
.LP
The following table covers the supported Microsoft DOS and Windows code pages. Codeset names for Microsoft code pages are prefixed with "\fBCP\fR", as in \fBCP850\fR.
.sp
.TS
tab(@) box;
lw(1.27i) |lw(4.23i)
lw(1.27i) |lw(4.23i)
.
Code Page@Description
_
\fBCP437\fR@MS-DOS, Latin United States
_
\fBCP720\fR@MS-DOS, Arabic
_
\fBCP737\fR@MS-DOS, Greek
_
\fBCP775\fR@MS-DOS, Baltic
_
\fBCP850\fR@MS-DOS, Multilingual Latin I
_
\fBCP852\fR@MS-DOS, Latin II
_
\fBCP855\fR@MS-DOS, Cyrillic
_
\fBCP857\fR@MS-DOS, Turkish
_
\fBCP860\fR@MS-DOS, Portuguese
_
\fBCP861\fR@MS-DOS, Icelandic
_
\fBCP862\fR@MS-DOS, Hebrew
_
\fBCP863\fR@MS-DOS, French Canada
_
\fBCP864\fR@MS-DOS, Arabic
_
\fBCP865\fR@MS-DOS, Nordic
_
\fBCP866\fR@MS-DOS, Cyrillic (Russian)
_
\fBCP869\fR@MS-DOS, Greek 2
_
\fBCP874\fR@MS-DOS, Thai
_
\fBCP949\fR@Windows, Korean
_
\fBCP1250\fR@Windows, Central Europe
_
\fBCP1251\fR@Windows, Cyrillic
_
\fBCP1252\fR@Windows, Latin
_
\fBCP1253\fR@Windows, Greek
_
\fBCP1254\fR@Windows, Turkish
_
\fBCP1255\fR@Windows, Hebrew
_
\fBCP1256\fR@Windows, Arabic
_
\fBCP1257\fR@Windows, Baltic
_
\fBCP1258\fR@Windows, Vietnamese
.TE
.sp
.SS "Other Code Pages"
.sp
.TS
tab(@) box;
lw(0.87i) |lw(4.63i)
lw(0.87i) |lw(4.63i)
.
Code Page@Description
_
\fBKOI8-R\fR, \fBKOI8-U\fR@T{
8-bit codesets for Russian and Ukrainian Cyrillic
T}
_
\fBPTCP154\fR@T{
Paratype \fBCP154\fR for Cyrillic; based on \fBCP1251\fR with added Asian Cyrillic symbols
T}
_
\fBALT\fR@8-bit Alternative PC Cyrillic
_
\fBMAC\fR@8-bit Macintosh Cyrillic
_
\fBDHN\fR@T{
Dom Handlowy Nauki, 8-bit codeset for Polish text
T}
_
\fBMazovia\fR@8-bit codeset for Polish text
_
\fBVISCII\fR@T{
Vietnamese Standard Code for Information Interchange is a modification of ASCII for Vietnamese.
T}
_
\fBTCVN\fR@T{
Vietnamese Standard Code for Information Interchange TCVN 5712:1993.
T}
_
T{
\fBTIS-620\fR (\fBTIS620-2533\fR, \fBEUC-TH\fR)
T}@T{
Thai Industrial Standard 620-2533 is practically identical to the ISO 8859-11 codeset (see above).
T}
_
\fBISCII (ISCII91)\fR@T{
Indian Script Code for Information Interchange is an ASCII-compatible codeset for Indic scripts.
T}
_
T{
\fBACE\fR (\fBIDNA2008-REGIST\fR)
T}@T{
ASCII Compatible Encoding defined in RFCs 3490, 3492, and 5890 without allowing unassigned characters; it also uses STD3 ASCII rules.
\fBIDNA2008-REGIST\fR is an alias for \fBACE\fR that uses the IDNA2008 terminology described in RFC 5890.
T}
_
T{
\fBACE-ALLOW-UNASSIGNED\fR (\fBIDNA2008-LOOKUP\fR)
T}@T{
Same as \fBACE\fR except that it allows unassigned characters. It is more suitable for query purposes, whereas \fBACE\fR is more suitable for storing host or domain names or giving them to machines.
\fBIDNA2008-LOOKUP\fR is an alias for \fBACE-ALLOW-UNASSIGNED\fR that uses the IDNA2008 terminology described in RFC 5890.
T}
.TE
.sp
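.LP
The ASCII Compatible Encoding form of a host name can be illustrated with the following sketch. It uses Python's built-in "idna" codec, which applies the RFC 3490 operations, as a stand-in. It does not call the \fBACE\fR conversion modules, and its handling of unassigned characters and STD3 ASCII rules may differ from the codesets described above. The helper names \fBto_ace\fR and \fBfrom_ace\fR are hypothetical.
.sp
.in +2
.nf
```python
def to_ace(hostname):
    # Convert each label of a Unicode host name to its ASCII Compatible
    # Encoding form (the RFC 3490 ToASCII operation).
    return hostname.encode("idna")


def from_ace(ace_name):
    # Reverse direction (the RFC 3490 ToUnicode operation).
    return ace_name.decode("idna")


# "bucher.example" with U+00FC (u with diaeresis) as the second letter.
name = "b" + chr(0xFC) + "cher.example"
print(to_ace(name))  # b'xn--bcher-kva.example'
```
.fi
.in -2
.sp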
.SH FILES
.sp
.ne 2
.mk
.na
\fB\fB/usr/lib/iconv/*.so\fR\fR
.ad
.br
.sp .6
.RS 4n
32-bit \fBiconv\fR conversion modules
.RE

.sp
.ne 2
.mk
.na
\fB\fB/usr/lib/iconv/{amd64,sparcv9}/*.so\fR\fR
.ad
.br
.sp .6
.RS 4n
64-bit \fBiconv\fR conversion modules
.RE

.sp
.ne 2
.mk
.na
\fB\fB/usr/lib/iconv/*.bt\fR\fR
.ad
.br
.sp .6
.RS 4n
\fBcconv\fR code conversion binary tables for \fBiconv\fR(1), \fBcconv\fR(3C), and \fBiconv\fR(3C)
.RE

.sp
.ne 2
.mk
.na
\fB\fB/usr/lib/iconv/geniconvtbl/binarytables/*.bt\fR\fR
.ad
.br
.sp .6
.RS 4n
\fBgeniconvtbl\fR conversion binary tables
.RE

.sp
.ne 2
.mk
.na
\fB\fB/usr/lib/iconv/alias\fR\fR
.ad
.br
.sp .6
.RS 4n
Alias table file of codeset names
.RE

.SH SEE ALSO
.sp
.LP
\fBgeniconvtbl\fR(1), \fBiconv\fR(1), \fBcconv\fR(3C), \fBcconv_close\fR(3C), \fBcconv_open\fR(3C), \fBcconvctl\fR(3C), \fBiconv\fR(3C), \fBiconv_close\fR(3C), \fBiconv_open\fR(3C), \fBiconvctl\fR(3C), \fBalias\fR(5), \fBgeniconvtbl-cconv\fR(5), \fBiconv_ja\fR(7), \fBiconv_ko\fR(7), \fBiconv_unicode\fR(7), \fBiconv_zh\fR(7), \fBiconv_zh_HK\fR(7), \fBiconv_zh_TW\fR(7)
.sp
.LP
Chernov, A., Registration of a Cyrillic Character Set, RFC 1489, RELCOM Development Team, July 1993.
.sp
.LP
Nussbacher, H., and Y. Bourvine, Hebrew Character Encoding for Internet Messages, RFC 1555, Israeli Inter-University, Hebrew University, December 1993.
.sp
.LP
Reynolds, J., and J. Postel, ASSIGNED NUMBERS, RFC 1700, University of Southern California/Information Sciences Institute, October 1994.
.sp
.LP
Simonson, K., Character Mnemonics & Character Sets, RFC 1345, Rationel Almen Planlaegning, June 1992.
.sp
.LP
Spinellis, D., Greek Character Encoding for Electronic Mail Messages, RFC 1947, SENA S.A., May 1996.