C++ : Introduction and Character Set

12-07-2020

C++ (pronounced C plus plus) is a high-level programming language that supports object-oriented programming. It was developed by Bjarne Stroustrup at AT&T Bell Laboratories in the early 1980s.

C++ is largely a superset of C and adds facilities such as classes, inheritance, function overloading, and operator overloading.

Let’s start learning C++ with its character set.

C++ Character set

Like any other programming language, C++ has its own character set. These characters are the fundamental units of the language, and only these characters or symbols are recognized by it.

The character set of C++ is categorized as follows.

  1. Letters: A to Z, a to z (the English alphabet)
  2. Digits: 0 to 9 (the numerals)
  3. Special Characters: + - * / ^ | \ ( ) [ ] { } = ! < > ' " $ , ; : . ? % & _ # @
  4. White Spaces: Blank space, horizontal tab, carriage return, newline
  5. Other Characters: Non-graphic symbols and any other ASCII characters, which can appear as data or inside literals.
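
The grouping above can be sketched in code. The helper below is a minimal illustration, not part of the language: the name classifyChar is made up for this example, and it relies on the standard <cctype> functions in the default "C" locale.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: reports which group of the character set
// a single character belongs to (default "C" locale assumed).
std::string classifyChar(char c) {
    // <cctype> functions expect a value representable as unsigned char.
    const unsigned char uc = static_cast<unsigned char>(c);
    if (std::isalpha(uc)) return "letter";
    if (std::isdigit(uc)) return "digit";
    if (std::isspace(uc)) return "white space";
    if (std::ispunct(uc)) return "special character";
    return "other";  // non-graphic characters such as '\0'
}
```

For instance, classifyChar('A') gives "letter" and classifyChar('+') gives "special character".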

Tokens

Tokens are the smallest individual units in a program, much like words in an English sentence.

C++ has five types of tokens which are listed below:

  • Keywords
  • Identifiers
  • Constants
  • Punctuators
  • Operators
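
To see the five token types together, here is a small sketch. The names tokenDemo and total are chosen only for this illustration; the comments label each token of the first statement.

```cpp
// A tiny function whose first statement contains every kind of token.
int tokenDemo() {
    // int    -> keyword
    // total  -> identifier
    // =  +   -> operators
    // 10  5  -> constants
    // ;      -> punctuator
    int total = 10 + 5;
    return total;
}
```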

Keywords

Keywords are words reserved by the language that convey a specific meaning to the compiler.

Because they are reserved for specific purposes, they cannot be redefined or used for anything else.

The commonly used keywords in C++ are given below (newer standards add more, such as bool, namespace, and constexpr):

asm, auto, break, signed, protected, float, do, switch, volatile, unsigned, new,
private, class, const, continue, case, sizeof, int, for, this, char, typedef, return,
inline, double, else, enum, default, struct, public, long, try, template, void,
operator, catch, friend, goto, if, extern, virtual, static, register, union, throw,
while, short, delete
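
As a rough illustration, the list above can be turned into a lookup. The function name isListedKeyword is hypothetical, and the set holds only the keywords from this list, spelled the way a compiler accepts them (for example int and extern); it is not the complete set of a modern standard.

```cpp
#include <set>
#include <string>

// Hypothetical helper: true when `word` is one of the keywords listed above.
// The comparison is case-sensitive, just like the language itself.
bool isListedKeyword(const std::string& word) {
    static const std::set<std::string> keywords = {
        "asm", "auto", "break", "signed", "protected", "float", "do",
        "switch", "volatile", "unsigned", "new", "private", "class",
        "const", "continue", "case", "sizeof", "int", "for", "this",
        "char", "typedef", "return", "inline", "double", "else", "enum",
        "default", "struct", "public", "long", "try", "template", "void",
        "operator", "catch", "friend", "goto", "if", "extern", "virtual",
        "static", "register", "union", "throw", "while", "short", "delete"
    };
    return keywords.count(word) > 0;
}
```

With this sketch, isListedKeyword("while") is true, while isListedKeyword("While") is false because the case differs.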

Identifiers

Identifiers are user-defined tokens that name program elements such as memory locations, functions, statements, objects, and classes.

Identifiers are the fundamental building blocks of the program.

Variables: The identifiers of memory locations are called variables.

Function Names: The identifiers of functions are called function names.

Labels: The identifiers of the statements are called labels.

There are some rules that must be strictly followed when constructing identifiers. They are given below.

  • An identifier is an arbitrarily long sequence of letters and digits.
  • The first character must be a letter.
  • The underscore (_) counts as a letter.
  • White spaces are not allowed.
  • Keywords cannot be used as identifiers because they have built-in meanings.
  • Uppercase and lowercase letters are different because C++ is case-sensitive.
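
The rules above can be checked mechanically. The following is a minimal sketch, assuming a made-up function name isValidIdentifier and only a short sample of keywords; a fuller check would use the complete keyword list.

```cpp
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper: applies the identifier rules listed above.
bool isValidIdentifier(const std::string& name) {
    // Assumption: only a small sample of keywords is checked here.
    static const std::set<std::string> sampleKeywords = {
        "int", "class", "while", "return", "for", "if"
    };

    if (name.empty()) return false;

    // The first character must be a letter (underscore counts as a letter).
    const unsigned char first = static_cast<unsigned char>(name[0]);
    if (!(std::isalpha(first) || first == '_')) return false;

    // Every character must be a letter, a digit, or an underscore,
    // which also rules out white space.
    for (char c : name) {
        const unsigned char uc = static_cast<unsigned char>(c);
        if (!(std::isalnum(uc) || uc == '_')) return false;
    }

    // Keywords are reserved and cannot be used as identifiers.
    return sampleKeywords.count(name) == 0;
}
```

Under these assumptions, total_marks and _count2 pass, while 2ndValue, my value, and while are rejected. While (capital W) passes because the language is case-sensitive.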

Constants

Constants are tokens that represent data items whose values never change while the program runs.

Constants are also known as literals.

Integer Constants:

Tokens made up only of digits are called integer constants. They are whole numbers with no fractional part.

An integer constant may be preceded by a + or - sign, which indicates whether the number is positive or negative.

Example:

+125 and 123 are positive integer constants; -123 and -45 are negative integer constants.
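
A short sketch using the same four values follows. The function and variable names are invented for this example. Strictly speaking, the compiler treats the leading + or - as a unary operator applied to the digits, but the result is the same.

```cpp
// Adds the four integer constants from the example above.
int sumOfIntegerConstants() {
    int a = +125;  // positive, sign written explicitly
    int b = 123;   // positive, sign omitted
    int c = -123;  // negative
    int d = -45;   // negative
    return a + b + c + d;  // 125 + 123 - 123 - 45 = 80
}
```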

Character Constants:

Any valid C++ character enclosed in a pair of single quotes (') is a character constant.

A character constant must not contain more than one character between the single quotes.

Example:

'a', '-', '2' are valid character constants.

'adeeb', '123' are not valid character constants.
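
The sketch below shows that a character constant such as '2' is a character, not the number 2. The helper name digitValue is hypothetical; it relies on the guarantee that the digit characters '0' to '9' have consecutive codes.

```cpp
// In C++ a character constant has type char, so it occupies one byte.
static_assert(sizeof('a') == 1, "a character constant is one byte in C++");

// Hypothetical helper: converts a digit character such as '2'
// into the number it represents.
int digitValue(char digit) {
    return digit - '0';
}
```

Here digitValue('2') yields the integer 2, whereas the constant '2' itself holds the character code of the digit.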

String Constants:

A sequence of zero or more characters enclosed within a pair of double quotes is known as a string constant.

"adeeb", "123", "C++" are valid examples of string constants.

When a string constant is stored in memory, a null character (\0) is stored in the last memory location (the last byte) to mark the end of the string.

For example, the string "adeeb" is stored in memory as

a | d | e | e | b | \0
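
The extra null character can be observed directly. In this sketch the names sample and visibleLength are assumptions made for illustration: the array needs six bytes for five letters, and counting stops at the \0.

```cpp
#include <cstddef>

// Five letters plus the terminating null character need six bytes.
const char sample[] = "adeeb";
static_assert(sizeof(sample) == 6, "five letters plus the null character");

// Hypothetical helper: counts characters up to, but not including, '\0'.
std::size_t visibleLength(const char* text) {
    std::size_t count = 0;
    while (text[count] != '\0') {
        ++count;
    }
    return count;
}
```

So visibleLength(sample) is 5, while sizeof(sample) is 6 because of the terminating \0.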

 

Punctuators

Punctuators are tokens that separate the other tokens in a program, so they are also known as separators. The punctuators in C++ are braces { }, parentheses ( ), brackets [ ], the comma (,), semicolon (;), asterisk (*), colon (:), and number sign or hash (#).
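
Several of these punctuators appear in even a very small piece of code. The function below is only an illustration; the names sumFirstThree, values, and first are invented for it.

```cpp
#include <cstddef>  // # introduces a preprocessor directive

int sumFirstThree() {                      // ( ) hold parameters, { } hold the body
    int values[3] = {1, 2, 3};             // [ ] give the size, commas separate items
    int* first = values;                   // * in a declaration marks a pointer
    int total = 0;                         // ; ends each statement
    for (std::size_t i = 0; i < 3; ++i) {
        total += first[i];
    }
    return total;
}
```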

Operators

Operators are tokens that trigger some kind of operation on data.

The data on which an operation is carried out are called operands. The symbols that perform the operation are called operators.

For example, in the operation a+b, a and b are the operands and the symbol + is the operator.
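
The same operation written as code might look like this. The function name addOperands is chosen only for this sketch.

```cpp
// a and b are the operands; + is the operator that acts on them.
int addOperands(int a, int b) {
    return a + b;
}
```

Calling addOperands(2, 3) applies the + operator to the operands 2 and 3.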
