Character constant
Syntax
'c-char' | (1) |
u8'c-char' | (2) | (since C23)
u'c-char' | (3) | (since C11)
U'c-char' | (4) | (since C11)
L'c-char' | (5) |
'c-char-sequence' | (6) |
L'c-char-sequence' | (7) |
u'c-char-sequence' | (8) | (since C11) (removed in C23)
U'c-char-sequence' | (9) | (since C11) (removed in C23)
where
- c-char is either
  - a character from the basic source character set minus single-quote ('), backslash (\), or the newline character.
  - an escape sequence: one of the special character escapes \' \" \? \\ \a \b \f \n \r \t \v, hex escapes \x... or octal escapes \... as defined in escape sequences.
  - a universal character name, \u... or \U... as defined in escape sequences. (since C99)
- c-char-sequence is a sequence of two or more c-chars.
3) 16-bit wide character constant, e.g. u'貓', but not u'🍌' (u'\U0001f34c'). Such a constant has type char16_t and a value equal to the value of c-char in the 16-bit encoding produced by mbrtoc16 (normally UTF-16). If c-char is not representable or maps to more than one 16-bit character, the value is implementation-defined.
4) 32-bit wide character constant, e.g. U'貓' or U'🍌'. Such a constant has type char32_t and a value equal to the value of c-char in the 32-bit encoding produced by mbrtoc32 (normally UTF-32). If c-char is not representable or maps to more than one 32-bit character, the value is implementation-defined.
(until C23)
3) UTF-16 character constant, e.g. u'貓', but not u'🍌' (u'\U0001f34c'). Such a constant has type char16_t and a value equal to the ISO 10646 code point value of c-char, provided that the code point value is representable with a single UTF-16 code unit (that is, c-char is in the range 0x0-0xD7FF or 0xE000-0xFFFF, inclusive). If c-char is not representable with a single UTF-16 code unit, the program is ill-formed.
4) UTF-32 character constant, e.g. U'貓' or U'🍌'. Such a constant has type char32_t and a value equal to the ISO 10646 code point value of c-char, provided that the code point value is representable with a single UTF-32 code unit (that is, c-char is in the range 0x0-0xD7FF or 0xE000-0x10FFFF, inclusive). If c-char is not representable with a single UTF-32 code unit, the program is ill-formed.
(since C23)
Notes
Multicharacter constants were inherited by C from the B programming language. Although not specified by the C standard, most compilers (MSVC is a notable exception) implement multicharacter constants as specified in B: the value of each char in the constant initializes successive bytes of the resulting integer, in big-endian zero-padded right-adjusted order, e.g. the value of '\1' is 0x00000001 and the value of '\1\2\3\4' is 0x01020304.
In C++, encodable ordinary character literals have type char, rather than int.
Unlike integer constants, a character constant may have a negative value if char is signed: on such implementations '\xFF' is an int with the value -1.
When used in a controlling expression of #if or #elif, character constants may be interpreted in terms of the source character set, the execution character set, or some other implementation-defined character set.
16/32-bit multicharacter constants are not widely supported and were removed in C23. Some common implementations (e.g. clang) do not accept them at all.
Example
```c
#include <stddef.h>
#include <stdio.h>
#include <uchar.h>

int main(void)
{
    printf("constant value \n");
    printf("-------- ----------\n");

    // integer character constants
    int c1='a'; printf("'a':\t %#010x\n", c1);
    int c2='🍌'; printf("'🍌':\t %#010x\n\n", c2); // implementation-defined

    // multicharacter constant
    int c3='ab'; printf("'ab':\t %#010x\n\n", c3); // implementation-defined

    // 16-bit wide character constants
    char16_t uc1 = u'a'; printf("'a':\t %#010x\n", (int)uc1);
    char16_t uc2 = u'¢'; printf("'¢':\t %#010x\n", (int)uc2);
    char16_t uc3 = u'猫'; printf("'猫':\t %#010x\n", (int)uc3);
    // implementation-defined (🍌 maps to two 16-bit characters)
    char16_t uc4 = u'🍌'; printf("'🍌':\t %#010x\n\n", (int)uc4);

    // 32-bit wide character constants
    char32_t Uc1 = U'a'; printf("'a':\t %#010x\n", (int)Uc1);
    char32_t Uc2 = U'¢'; printf("'¢':\t %#010x\n", (int)Uc2);
    char32_t Uc3 = U'猫'; printf("'猫':\t %#010x\n", (int)Uc3);
    char32_t Uc4 = U'🍌'; printf("'🍌':\t %#010x\n\n", (int)Uc4);

    // wide character constants
    wchar_t wc1 = L'a'; printf("'a':\t %#010x\n", (int)wc1);
    wchar_t wc2 = L'¢'; printf("'¢':\t %#010x\n", (int)wc2);
    wchar_t wc3 = L'猫'; printf("'猫':\t %#010x\n", (int)wc3);
    wchar_t wc4 = L'🍌'; printf("'🍌':\t %#010x\n\n", (int)wc4);
}
```
Possible output:
```
constant value
-------- ----------
'a':     0x00000061
'🍌':    0xf09f8d8c

'ab':    0x00006162

'a':     0x00000061
'¢':     0x000000a2
'猫':    0x0000732b
'🍌':    0x0000df4c

'a':     0x00000061
'¢':     0x000000a2
'猫':    0x0000732b
'🍌':    0x0001f34c

'a':     0x00000061
'¢':     0x000000a2
'猫':    0x0000732b
'🍌':    0x0001f34c
```
References
- C23 standard (ISO/IEC 9899:2024):
  - 6.4.4.5 Character constants (p: 63-66)
- C17 standard (ISO/IEC 9899:2018):
  - 6.4.4.4 Character constants (p: 48-50)
- C11 standard (ISO/IEC 9899:2011):
  - 6.4.4.4 Character constants (p: 67-70)
- C99 standard (ISO/IEC 9899:1999):
  - 6.4.4.4 Character constants (p: 59-61)
- C89/C90 standard (ISO/IEC 9899:1990):
  - 3.1.3.4 Character constants
See also
C++ documentation for Character literal