Code page 437: Difference between revisions

From Wikipedia, the free encyclopedia
Revision as of 18:48, 8 March 2006

IBM PC or MS-DOS code page 437, often abbreviated CP437 and also known as DOS-US or OEM-US, is the original character set of the IBM PC, circa 1981. The following is a table representing CP437 using the equivalent Unicode characters:

    .0      .1      .2      .3      .4      .5      .6      .7      .8      .9      .A      .B      .C      .D      .E      .F
0_  NULL 0  ☺ 263A  ☻ 263B  ♥ 2665  ♦ 2666  ♣ 2663  ♠ 2660  • 2022  ◘ 25D8  ○ 25CB  ◙ 25D9  ♂ 2642  ♀ 2640  ♪ 266A  ♫ 266B  ☼ 263C
1_  ► 25BA  ◄ 25C4  ↕ 2195  ‼ 203C  ¶ B6    § A7    ▬ 25AC  ↨ 21A8  ↑ 2191  ↓ 2193  → 2192  ← 2190  ∟ 221F  ↔ 2194  ▲ 25B2  ▼ 25BC
2_  SP 20   ! 21    " 22    # 23    $ 24    % 25    & 26    ' 27    ( 28    ) 29    * 2A    + 2B    , 2C    - 2D    . 2E    / 2F
3_  0 30    1 31    2 32    3 33    4 34    5 35    6 36    7 37    8 38    9 39    : 3A    ; 3B    < 3C    = 3D    > 3E    ? 3F
4_  @ 40    A 41    B 42    C 43    D 44    E 45    F 46    G 47    H 48    I 49    J 4A    K 4B    L 4C    M 4D    N 4E    O 4F
5_  P 50    Q 51    R 52    S 53    T 54    U 55    V 56    W 57    X 58    Y 59    Z 5A    [ 5B    \ 5C    ] 5D    ^ 5E    _ 5F
6_  ` 60    a 61    b 62    c 63    d 64    e 65    f 66    g 67    h 68    i 69    j 6A    k 6B    l 6C    m 6D    n 6E    o 6F
7_  p 70    q 71    r 72    s 73    t 74    u 75    v 76    w 77    x 78    y 79    z 7A    { 7B    | 7C    } 7D    ~ 7E    ⌂ 2302
8_  Ç C7    ü FC    é E9    â E2    ä E4    à E0    å E5    ç E7    ê EA    ë EB    è E8    ï EF    î EE    ì EC    Ä C4    Å C5
9_  É C9    æ E6    Æ C6    ô F4    ö F6    ò F2    û FB    ù F9    ÿ FF    Ö D6    Ü DC    ¢ A2    £ A3    ¥ A5    ₧ 20A7  ƒ 192
A_  á E1    í ED    ó F3    ú FA    ñ F1    Ñ D1    ª AA    º BA    ¿ BF    ⌐ 2310  ¬ AC    ½ BD    ¼ BC    ¡ A1    « AB    » BB
B_  ░ 2591  ▒ 2592  ▓ 2593  │ 2502  ┤ 2524  ╡ 2561  ╢ 2562  ╖ 2556  ╕ 2555  ╣ 2563  ║ 2551  ╗ 2557  ╝ 255D  ╜ 255C  ╛ 255B  ┐ 2510
C_  └ 2514  ┴ 2534  ┬ 252C  ├ 251C  ─ 2500  ┼ 253C  ╞ 255E  ╟ 255F  ╚ 255A  ╔ 2554  ╩ 2569  ╦ 2566  ╠ 2560  ═ 2550  ╬ 256C  ╧ 2567
D_  ╨ 2568  ╤ 2564  ╥ 2565  ╙ 2559  ╘ 2558  ╒ 2552  ╓ 2553  ╫ 256B  ╪ 256A  ┘ 2518  ┌ 250C  █ 2588  ▄ 2584  ▌ 258C  ▐ 2590  ▀ 2580
E_  α 3B1   ß DF    Γ 393   π 3C0   Σ 3A3   σ 3C3   µ B5    τ 3C4   Φ 3A6   Θ 398   Ω 3A9   δ 3B4   ∞ 221E  φ 3C6   ε 3B5   ∩ 2229
F_  ≡ 2261  ± B1    ≥ 2265  ≤ 2264  ⌠ 2320  ⌡ 2321  ÷ F7    ≈ 2248  ° B0    ∙ 2219  · B7    √ 221A  ⁿ 207F  ² B2    ■ 25A0  NBSP A0

Each cell shows the character followed by its Unicode code point in hexadecimal. SP is the space character and NBSP is the no-break space.
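
As a rough cross-check of the upper half of the table, the following Python sketch decodes bytes 0x80–0xFF with Python's built-in `cp437` codec and groups them into rows of 16. The function name `cp437_high_rows` is made up for this illustration. Note that the same codec decodes bytes 0x00–0x1F as control characters rather than as the graphics shown in rows 0_ and 1_.

```python
def cp437_high_rows() -> list[str]:
    """Decode bytes 0x80-0xFF as CP437 and split them into eight rows of 16."""
    high = bytes(range(0x80, 0x100)).decode("cp437")
    return [high[i:i + 16] for i in range(0, len(high), 16)]


if __name__ == "__main__":
    # Print rows 8_ through F_ in the same order as the table.
    for offset, row in enumerate(cp437_high_rows()):
        print(f"{0x8 + offset:X}_", " ".join(row))
```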

It is based on ASCII, with the following modifications:

  • The C0 control range (0x00–0x1F hex) is mapped to graphics characters. The codes can still act as controls (typing "echo", a space, Control-G and then Enter makes the PC speaker beep, even at the command prompt on Windows XP), but when placed in display RAM, for example in a screen editor such as MS-DOS Edit, they show as graphics. The graphics are varied, including smiling faces, card suits and musical notes. Code 0x7F, DEL, similarly shows as a graphic (a house).
  • The high-bit range, 0x80–0xFF, is mapped to various symbols: a few European characters (accented Latin vowels, etc.), in no particular order and insufficient for most Western European languages; box-drawing characters; mathematical symbols; and a few Greek letters.
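
The display behaviour described in the two points above can be sketched in Python. This is only an illustration: `C0_GRAPHICS` and `as_displayed` are hypothetical names, the glyphs are copied from rows 0_ and 1_ of the table, and showing byte 0x00 as a blank is an assumption. Bytes from 0x20 upward (other than 0x7F) are decoded with Python's built-in `cp437` codec.

```python
# Graphics for bytes 0x00-0x1F, in code order, taken from rows 0_ and 1_ of
# the table. Byte 0x00 is rendered as a blank here (an assumption).
C0_GRAPHICS = " ☺☻♥♦♣♠•◘○◙♂♀♪♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼"
assert len(C0_GRAPHICS) == 32


def as_displayed(data: bytes) -> str:
    """Return the glyphs a CP437 display would show for the given bytes."""
    out = []
    for b in data:
        if b < 0x20:
            out.append(C0_GRAPHICS[b])   # C0 range: graphic, not a control
        elif b == 0x7F:
            out.append("\u2302")         # DEL shows as a house
        else:
            out.append(bytes([b]).decode("cp437"))
    return "".join(out)


if __name__ == "__main__":
    # Control-G (0x07) is a bell when interpreted, a bullet when displayed.
    print(as_displayed(b"\x01 \x03 \x07 \x0e \x7f"))
```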

The repertoire of CP437 was taken from the character set of Wang word-processing machines, according to Bill Gates in an interview with Gates and Paul Allen that appeared in the 2 October 1995 edition of Fortune magazine:

"… we were also fascinated by dedicated word processors from Wang, because we believed that general-purpose machines could do that just as well. That's why, when it came time to design the keyboard for the IBM PC, we put the funny Wang character set into the machine—you know, smiley faces and boxes and triangles and stuff. We were thinking we'd like to do a clone of Wang word-processing software someday."

CP437 is inadequate for internationalisation, as it lacks characters necessary for some languages, such as À (capital A with grave) for French, and has only a few Greek letters. Later MS-DOS character sets, such as CP850 (DOS Latin-1), CP852 (DOS Central-European) and CP737 (DOS Greek), filled the gaps for international use while still being nearly compatible with CP437 by retaining most of the box-drawing characters. All CP437 characters are in Unicode and in Microsoft's WGL4 character set, therefore in most of the fonts on Microsoft Windows, and also in the VGA font of Linux, and the ISO 10646 fonts for X11.
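
The coverage differences mentioned above can be probed with Python's built-in `cp437` and `cp850` codecs. This is a small sketch, not an exhaustive comparison, and `encodable` is a hypothetical helper. It checks À, which CP437 lacks, and two box-drawing characters, one of which CP850 kept and one of which it dropped.

```python
def encodable(ch: str, codec: str) -> bool:
    """Report whether a single character exists in the given code page."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # U+00C0 (A with grave), U+2551 (double vertical), U+2561 (mixed join)
    for ch in ("\u00c0", "\u2551", "\u2561"):
        coverage = {codec: encodable(ch, codec) for codec in ("cp437", "cp850")}
        print(f"U+{ord(ch):04X}", ch, coverage)
```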

Implementors of mapping tables to Unicode should note that CP437 unifies some characters: 0xE1 is both the German sharp S (U+00DF, ß) and the Greek lowercase beta (U+03B2, β); 0xE4 is both the n-ary summation sign (U+2211, ∑) and the Greek uppercase sigma (U+03A3, Σ); 0xE6 is both the micro sign (U+00B5, µ) and the Greek lowercase mu (U+03BC, μ); 0xEA is both the Ohm sign (U+2126, Ω) and the Greek uppercase omega (U+03A9, Ω); and 0xEE is both the element-of sign (U+2208, ∈) and the Greek lowercase epsilon (U+03B5, ε).
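
One way an implementor might handle these unified code points when encoding is to fold the alternate Unicode reading onto the one used in the table before encoding. The sketch below assumes that approach. `UNIFIED_ALTERNATES` and `encode_cp437_lenient` are names invented for the example, and the final step relies on Python's built-in `cp437` codec, which accepts only the readings listed in the table.

```python
# Alternate Unicode readings of the unified CP437 code points, folded onto
# the reading used in the table (hypothetical helper table).
UNIFIED_ALTERNATES = {
    "\u03b2": "\u00df",  # Greek small beta -> sharp s             (0xE1)
    "\u2211": "\u03a3",  # n-ary summation  -> Greek capital sigma (0xE4)
    "\u03bc": "\u00b5",  # Greek small mu   -> micro sign          (0xE6)
    "\u2126": "\u03a9",  # ohm sign         -> Greek capital omega (0xEA)
    "\u2208": "\u03b5",  # element of       -> Greek small epsilon (0xEE)
}
_FOLD = str.maketrans(UNIFIED_ALTERNATES)


def encode_cp437_lenient(text: str) -> bytes:
    """Encode text as CP437, accepting either reading of a unified code."""
    return text.translate(_FOLD).encode("cp437")


if __name__ == "__main__":
    # Both spellings end up as the same five bytes.
    print(encode_cp437_lenient("\u03b2\u2211\u03bc\u2126\u2208").hex(" "))
    print(encode_cp437_lenient("\u00df\u03a3\u00b5\u03a9\u03b5").hex(" "))
```

Decoding cannot be made symmetric in this way: a decoder has to pick one reading per byte, so a round trip may replace, for example, β with ß.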
