Code page 437

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Monedula at 00:41, 28 January 2005 (table format). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

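IBM PC or MS-DOS code page 437 (CP437), also known as DOS-US or OEM-US, is the original character set of the IBM PC, introduced in 1981. The character set is laid out as follows; each cell shows the character followed by its Unicode code point in hexadecimal: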

    .0    .1    .2    .3    .4    .5    .6    .7    .8    .9    .A    .B    .C    .D    .E    .F

0.  NUL   ☺     ☻     ♥     ♦     ♣     ♠     •     ◘     ○     ◙     ♂     ♀     ♪     ♫     ☼
    0     263A  263B  2665  2666  2663  2660  2022  25D8  25CB  25D9  2642  2640  266A  266B  263C

1.  ►     ◄     ↕     ‼     ¶     §     ▬     ↨     ↑     ↓     →     ←     ∟     ↔     ▲     ▼
    25BA  25C4  2195  203C  B6    A7    25AC  21A8  2191  2193  2192  2190  221F  2194  25B2  25BC

2.  SP    !     "     #     $     %     &     '     (     )     *     +     ,     -     .     /
    20    21    22    23    24    25    26    27    28    29    2A    2B    2C    2D    2E    2F

3.  0     1     2     3     4     5     6     7     8     9     :     ;     <     =     >     ?
    30    31    32    33    34    35    36    37    38    39    3A    3B    3C    3D    3E    3F

4.  @     A     B     C     D     E     F     G     H     I     J     K     L     M     N     O
    40    41    42    43    44    45    46    47    48    49    4A    4B    4C    4D    4E    4F

5.  P     Q     R     S     T     U     V     W     X     Y     Z     [     \     ]     ^     _
    50    51    52    53    54    55    56    57    58    59    5A    5B    5C    5D    5E    5F

6.  `     a     b     c     d     e     f     g     h     i     j     k     l     m     n     o
    60    61    62    63    64    65    66    67    68    69    6A    6B    6C    6D    6E    6F

7.  p     q     r     s     t     u     v     w     x     y     z     {     |     }     ~     ⌂
    70    71    72    73    74    75    76    77    78    79    7A    7B    7C    7D    7E    2302

8.  Ç     ü     é     â     ä     à     å     ç     ê     ë     è     ï     î     ì     Ä     Å
    C7    FC    E9    E2    E4    E0    E5    E7    EA    EB    E8    EF    EE    EC    C4    C5

9.  É     æ     Æ     ô     ö     ò     û     ù     ÿ     Ö     Ü     ¢     £     ¥     ₧     ƒ
    C9    E6    C6    F4    F6    F2    FB    F9    FF    D6    DC    A2    A3    A5    20A7  192

A.  á     í     ó     ú     ñ     Ñ     ª     º     ¿     ⌐     ¬     ½     ¼     ¡     «     »
    E1    ED    F3    FA    F1    D1    AA    BA    BF    2310  AC    BD    BC    A1    AB    BB

B.  ░     ▒     ▓     │     ┤     ╡     ╢     ╖     ╕     ╣     ║     ╗     ╝     ╜     ╛     ┐
    2591  2592  2593  2502  2524  2561  2562  2556  2555  2563  2551  2557  255D  255C  255B  2510

C.  └     ┴     ┬     ├     ─     ┼     ╞     ╟     ╚     ╔     ╩     ╦     ╠     ═     ╬     ╧
    2514  2534  252C  251C  2500  253C  255E  255F  255A  2554  2569  2566  2560  2550  256C  2567

D.  ╨     ╤     ╥     ╙     ╘     ╒     ╓     ╫     ╪     ┘     ┌     █     ▄     ▌     ▐     ▀
    2568  2564  2565  2559  2558  2552  2553  256B  256A  2518  250C  2588  2584  258C  2590  2580

E.  α     ß     Γ     π     Σ     σ     µ     τ     Φ     Θ     Ω     δ     ∞     φ     ε     ∩
    3B1   DF    393   3C0   3A3   3C3   B5    3C4   3A6   398   3A9   3B4   221E  3C6   3B5   2229

F.  ≡     ±     ≥     ≤     ⌠     ⌡     ÷     ≈     °     ∙     ·     √     ⁿ     ²     ■     NBSP
    2261  B1    2265  2264  2320  2321  F7    2248  B0    2219  B7    221A  207F  B2    25A0  A0

(NUL, SP and NBSP stand for the null character, the space and the no-break space.)

It is based on ASCII, with the following modifications:

  • The C0 control range (0x00-0x1F) is mapped to graphics characters. The codes can still act as controls: typing "echo", a space, Control-G and then Enter makes the PC speaker beep, even at the command prompt on Windows XP. When displayed, however, for example in a screen editor such as MS-DOS Edit, they appear as graphics such as smiling faces, card suits and musical notes. Code 0x7F (DEL) likewise displays as a graphic (a house).
  • The high-bit range (0x80-0xFF) is mapped to various symbols: a few European characters (accented Latin vowels, etc.), in no particular order and insufficient for most Western European languages; box-drawing characters; mathematical symbols; and a few Greek letters.
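
The dual role of the control range can be illustrated with a short Python sketch. Python's built-in "cp437" codec decodes bytes 0x00-0x1F and 0x7F as control characters, so a decoder that wants the displayed glyphs has to substitute them itself. The function name decode_cp437_display and the constant C0_GRAPHICS are hypothetical names chosen for this illustration; the glyphs are the ones listed in rows 0 and 1 of the table above.

```python
# Glyphs displayed for bytes 0x01..0x1F, in order (rows 0 and 1 of the table).
# Written as escapes so the source stays ASCII-only.
C0_GRAPHICS = (
    "\u263a\u263b\u2665\u2666\u2663\u2660\u2022\u25d8"
    "\u25cb\u25d9\u2642\u2640\u266a\u266b\u263c"
    "\u25ba\u25c4\u2195\u203c\u00b6\u00a7\u25ac\u21a8"
    "\u2191\u2193\u2192\u2190\u221f\u2194\u25b2\u25bc"
)


def decode_cp437_display(data: bytes) -> str:
    """Decode CP437 bytes the way a screen shows them (hypothetical helper)."""
    out = []
    for b in data:
        if 0x01 <= b <= 0x1F:
            out.append(C0_GRAPHICS[b - 1])   # graphic instead of control code
        elif b == 0x7F:
            out.append("\u2302")             # DEL displays as a house
        else:
            # All other bytes (including 0x00, which stays U+0000):
            # defer to Python's standard cp437 codec.
            out.append(bytes([b]).decode("cp437"))
    return "".join(out)
```

For example, decode_cp437_display(b"\x01 Hi \x0e") yields the string "☺ Hi ♫", whereas b"\x01".decode("cp437") yields the control character U+0001.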

The repertoire of CP437 was taken from the character set of Wang word-processing machines, as Bill Gates stated explicitly in an interview with him and Paul Allen in the 2 October 1995 issue of Fortune magazine:

"... we were also fascinated by dedicated word processors from Wang, because we believed that general-purpose machines could do that just as well. That's why, when it came time to design the keyboard for the IBM PC, we put the funny Wang character set into the machine--you know, smiley faces and boxes and triangles and stuff. We were thinking we'd like to do a clone of Wang word-processing software someday."

CP437 is inadequate for internationalisation: it lacks characters necessary for some languages, such as À (capital A with grave) for French, and it has only a few Greek letters. Later MS-DOS character sets, such as CP850 (DOS Latin-1), CP852 (DOS Central European) and CP737 (DOS Greek), filled the gaps for international use while remaining nearly compatible with CP437 by retaining the box-drawing characters. All CP437 characters are in Unicode and in Microsoft's WGL4 character set, and are therefore present in most fonts on Microsoft Windows, in the VGA font of Linux, and in the ISO 10646 fonts for X11.
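
As a rough illustration of these gaps, the following Python sketch uses the standard library codecs "cp437", "cp850", "cp852" and "cp737" to test whether a string can be represented in a given code page. The helper name encodable is an assumption made for this example.

```python
def encodable(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given code page."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Capital A with grave is missing from CP437 but present in CP850.
print(encodable("\u00c0", "cp437"))   # False
print(encodable("\u00c0", "cp850"))   # True

# The horizontal box-drawing line keeps the same byte (0xC4) in all four sets.
for codec in ("cp437", "cp850", "cp852", "cp737"):
    print(codec, bytes([0xC4]).decode(codec) == "\u2500")
```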

Implementors of mapping tables to Unicode should note that CP437 unifies some characters: 0xE1 is both the German sharp S (U+00DF) and the Greek lowercase beta (U+03B2); 0xE4 is both the n-ary summation sign (U+2211) and the Greek uppercase sigma (U+03A3); 0xE6 is both the micro sign (U+00B5) and the Greek lowercase mu (U+03BC); 0xEA is both the Ohm sign (U+2126) and the Greek uppercase omega (U+03A9); and 0xEE is both the element-of sign (U+2208) and the Greek lowercase epsilon (U+03B5).
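
One way to handle this unification when converting from Unicode is sketched below in Python. The standard "cp437" codec accepts only one Unicode character per byte (for 0xE1 it accepts U+00DF but rejects U+03B2), so the sketch adds a small fallback table for the alternate characters named above. The names UNIFIED and encode_cp437_lenient are hypothetical, and this is only one possible policy, not a standard one.

```python
# Alternate Unicode characters that share a CP437 byte with the character
# the standard codec already accepts (taken from the paragraph above).
UNIFIED = {
    "\u03b2": 0xE1,  # Greek small beta    -> byte of sharp s (U+00DF)
    "\u2211": 0xE4,  # n-ary summation     -> byte of capital sigma (U+03A3)
    "\u03bc": 0xE6,  # Greek small mu      -> byte of micro sign (U+00B5)
    "\u2126": 0xEA,  # ohm sign            -> byte of capital omega (U+03A9)
    "\u2208": 0xEE,  # element-of sign     -> byte of small epsilon (U+03B5)
}


def encode_cp437_lenient(text: str) -> bytes:
    """Encode to CP437, folding the unified alternates onto their shared byte."""
    out = bytearray()
    for ch in text:
        if ch in UNIFIED:
            out.append(UNIFIED[ch])
        else:
            out += ch.encode("cp437")  # raises UnicodeEncodeError if unmapped
    return bytes(out)
```

The conversion is lossy in one direction: encode_cp437_lenient("\u03bc") gives the byte 0xE6, and decoding that byte with the standard codec gives the micro sign U+00B5, not the Greek mu.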
