C++ Primitive Built-in Types
整理自 C++ Primer, 5th Edition
Some languages, such as Smalltalk and Python, check types at run time. In contrast, C++ is a statically typed language; type checking is done at compile time. As a consequence, the compiler must know the type of every name used in the program.
C++ allows programmers to define types that include operations as well as data. A major design goal of C++ is to let programmers define their own types that are as easy to use as the built-in types.
C++ primitive types that include:
- the arithmetic types
- a special type
void
.
Arithmetic Types
Type | Meaning | Minimum Size |
---|---|---|
bool | boolean | NA |
char | character | 8 bits |
wchar_t | wide character | 16 bits |
char16_t | Unicode character | 16 bits |
char32_t | Unicode character | 32 bits |
short | short integer | 16 bits |
int | integer | 16 bits |
long | long integer | 32 bits |
long long | long integer | 64 bits |
float | single-precision floating-point | 6 significant digits |
double | double-precision floating-point | 10 significant digits |
long double | extended-precision floating-point | 10 significant digits |
The size of—that is, the number of bits in—the arithmetic types varies across machines. The standard guarantees minimum sizes as listed in Table 2.1. However, compilers are allowed to use larger sizes for these types. Because the number of bits varies, the largest (or smallest) value that a type can represent also varies. 比如 int 在 32-bit 系统的 Visual Studio 2008 下就是 32 bits 长的。规范只规定了诸如 “int: Not smaller than short. At least 16 bits.” 这样的大规范,不同的机器、不同的编译器可以有不同的 type size。(源出:Variables and types)
也有些类型是定长的,比如 __int32
一定是 32 bits,具体见 Data Type Ranges 和 What does the C++ standard state the size of int, long type to be?。
A char
is guaranteed to be big enough to hold numeric values corresponding to the characters in the machine’s basic character set. That is, a char
is the same size as a single machine byte.
The smallest chunk of addressable memory is referred to as a “byte.” The basic unit of storage, usually a small number of bytes, is referred to as a “word.” On most machines a byte contains 8 bits and a word is either 32 or 64 bits, that is, 4 or 8 bytes.
A signed
type represents negative or positive numbers (including zero); an unsigned
type represents only values >= zero.
Unlike the other integer types, there are three distinct basic character types:
char
signed char
unsigned char
不要天真地以为一定是 (plain) char == signed char
,因为有的系统是 (plain) char == signed char
,有的系统是 (plain) char == unsigned char
,这也是由编译器说了算的。
The standard does not define how signed types are represented, but does specify that the range should be evenly divided between positive and negative values. Hence, an 8-bit signed char
is guaranteed to be able to hold values from –127 through 127; most modern machines use representations that allow values from –128 through 127.
The remaining integral types represent integer values of (potentially) different sizes. The language guarantees that an int
will be at least as large as short
, a long
at least as large as an int
, and long long
at least as large as long
. The type long long
was introduced by the new standard.
Typically, float
s are represented in one word (32 bits), double
s in two words (64 bits), and long double
s in either three or four words (96 or 128 bits). The type long double
is often used as a way to accommodate special-purpose floating-point hardware; its precision is more likely to vary from one implementation to another.
Advice: Deciding which Type to Use
C++, like C, is designed to let programs get close to the hardware when necessary.The arithmetic types are defined to cater to the peculiarities of various kinds of hardware. Accordingly, the number of arithmetic types in C++ can be bewildering. Most programmers can (and should) ignore these complexities by restricting the types they use. A few rules of thumb can be useful in deciding which type to use:
- Use an unsigned type when you know that the values cannot be negative.
- Use
int
for integer arithmetic.short
is usually too small and, in practice,long
often has the same size asint
. If your data values are larger than the minimum guaranteed size of anint
, then uselong long
. - Do not use plain
char
orbool
in arithmetic expressions. Use them only to hold characters or truth values. Computations usingchar
are especially problematic becausechar
issigned
on some machines andunsigned
on others. If you need a tiny integer, explicitly specify eithersigned char
orunsigned char
. - Use
double
for floating-point computations;float
usually does not have enough precision, and the cost of double-precision calculations versus single-precision is negligible. In fact, on some machines, double-precision operations are faster than single. The precision offered bylong double
usually is unnecessary and often entails considerable run-time cost.
Type Conversions
bool b = 42; // b is true
int i = b; // i has value 1
i = 3.14; // i has value 3
double pi = i; // pi has value 3.0
unsigned char c = -1; // assuming 8-bit chars, c has value 255
signed char c2 = 256; // assuming 8-bit chars, the value of c2 is undefined
int i = 42;
if (i) // condition will evaluate as true
i = 0;
- When we assign one of the non-
bool
arithmetic types to abool
object, the result is false if the value is 0 and true otherwise. - When we assign a
bool
to one of the other arithmetic types, the resulting value is 1 if the bool is true and 0 if the bool is false. - When we assign a floating-point value to an object of integral type, the value is truncated. The value that is stored is the part before the decimal point.
- When we assign an integral value to an object of floating-point type, the fractional part is zero. Precision may be lost if the integer has more bits than the floating-point object can accommodate.
- If we assign an out-of-range value to an object of
unsigned
type, the result is the remainder of the value modulo the number of values the target type can hold. For example, an 8-bitunsigned char
can hold values from 0 through 255, inclusive. If we assign a value outside this range, the compiler assigns the remainder of that value modulo 256. Therefore, assigning –1 to an 8-bitunsigned char
gives that object the value 255. - If we assign an out-of-range value to an object of signed type, the result is undefined. The program might appear to work, it might crash, or it might produce garbage values.
Advice: Avoid Undefined and Implementation-Defined Behavior
Undefined behavior results from errors that the compiler is not required (and sometimes is not able) to detect. Even if the code compiles, a program that executes an undefined expression is in error.
Unfortunately, programs that contain undefined behavior can appear to execute correctly in some circumstances and/or on some compilers. There is no guarantee that the same program, compiled under a different compiler or even a subsequent release of the same compiler, will continue to run correctly. Nor is there any guarantee that what works with one set of inputs will work with another.
Similarly, programs usually should avoid implementation-defined behavior, such as assuming that the size of an int
is a fixed and known value. Such programs are said to be nonportable. When the program is moved to another machine, code that relied on implementation-defined behavior may fail. Tracking down these sorts of problems in previously working programs is, mildly put, unpleasant.
Caution: Don’t Mix Signed and Unsigned Types
If we use both unsigned
and int
values in an arithmetic expression, the int
value ordinarily is converted to unsigned
.
unsigned u = 10;
int i = -42;
std::cout << i + i << std::endl; // prints -84
std::cout << u + i << std::endl; // if 32-bit ints, prints 4294967264
unsigned u1 = 42, u2 = 10;
std::cout << u1 - u2 << std::endl; // ok: result is 32
std::cout << u2 - u1 << std::endl; // ok: but the result will wrap
around
// WRONG: u can never be less than 0; the condition will always succeed
for (unsigned u = 10; u >= 0; --u)
std::cout << u << std::endl;
Expressions that mix signed and unsigned values can yield surprising results when the signed value is negative. It is essential to remember that signed values are automatically converted to unsigned. For example, in an expression like a * b
, if a
is -1 and b
is 1, then if both a
and b
are int
s, the value is, as expected -1. However, if a
is int
and b
is an unsigned
, then the value of this expression depends on how many bits an int has on the particular machine. On our machine, this expression yields 4294967295.
Comments