Tokens in C++

In this article, we will learn about tokens and their types in C++. Let’s start!

What is a Token in C++?

A token is the smallest unit of a program that the compiler understands. In C++, tokens are divided into the following types:

  • Keywords
  • Identifiers
  • Constants
  • Strings
  • Special Symbols
  • Operators
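As a rough illustration, the short function below labels each token of one statement in a comment. The function add_five and its variable names are made up for this sketch. The labels follow the categories listed above and are not an exact account of what any particular compiler does internally.

```cpp
// Hypothetical example: add_five and its variable names are illustrative only.
int add_five(int price) {
    // The statement below is made of seven tokens:
    //   int    -> keyword
    //   total  -> identifier
    //   =      -> operator
    //   price  -> identifier
    //   +      -> operator
    //   5      -> constant
    //   ;      -> special symbol
    int total = price + 5;
    return total;
}
```

White space between the tokens only separates them; it is not a token itself.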

Before moving on to the types, let’s first talk about the character set in C++.

Character Set in C++

A C++ program, along with its input and output, is made up of characters and symbols. The C++ character set is the collection of characters and symbols that the language recognizes. It consists of:

  • Letters: uppercase (A-Z) and lowercase (a-z) letters.
  • Digits: the numbers 0-9.
  • Special characters: symbols such as , ; \ _ ? : and others.
  • White spaces: blank space, new line, horizontal tab and carriage return.

Types of Tokens in C++

1. Keywords in C++

In a programming language, there are certain reserved words that have fixed meaning. These words are known as keywords. We cannot use a keyword for purposes other than what it’s reserved for.

The C language (C89) supports 32 keywords:

auto        double    int         struct
break       else      long        switch
case        enum      register    typedef
char        extern    return      union
const       float     short       unsigned
continue    for       signed      void
default     goto      sizeof      volatile
do          if        static      while

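The original C++ standard (C++98) supports these 32 keywords and 31 additional keywords that are not among the C keywords listed above. Later standards add more, such as nullptr and constexpr in C++11.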

asm           bool         catch          class
const_cast    delete       dynamic_cast   explicit
export        false        friend         inline
mutable       namespace    new            operator
private       protected    public         reinterpret_cast
static_cast   template     this           throw
true          try          typeid         typename
using         virtual      wchar_t

2. Identifiers in C++

We can give names to the various elements of a program, such as variables, functions and structures. These user-defined names are called identifiers. An identifier must be unique within its scope so that the compiler can tell which element it refers to.

We need to follow some rules for naming identifiers:

  • An identifier name should begin with a letter or an underscore ( _ ).
  • An identifier name should be made up of letters, digits and underscores only.
  • Special characters and white spaces are not allowed.
  • No keyword can be used as an identifier.
  • Identifier names are case sensitive.
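  • The C++ standard does not limit the length of an identifier and treats every character as significant, although individual compilers may impose their own limits. The 31-character limit that is often quoted comes from the C standards.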

Examples of some valid identifiers are: Name, age, add_numbers, _students, etc.
Examples of some invalid identifiers are: Name@ (contains a special character), 6age (begins with a digit), new (a keyword), etc.
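The naming rules above can be sketched as a small checker. This is only an illustration: is_valid_identifier and its helpers are hypothetical names, not part of the standard library. It assumes plain ASCII letters and checks just a small sample of keywords instead of the full list.

```cpp
#include <string>

// Hypothetical helpers, not standard library functions.
// Assumptions: only ASCII letters count as letters, and only a small
// sample of keywords is checked rather than the full keyword list.
bool is_letter(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

bool is_digit(char c) {
    return c >= '0' && c <= '9';
}

bool is_sample_keyword(const std::string& name) {
    const char* const sample[] = {"new", "int", "class", "return", "while"};
    for (const char* keyword : sample) {
        if (name == keyword) return true;
    }
    return false;
}

bool is_valid_identifier(const std::string& name) {
    // Must begin with a letter or an underscore (an empty name is rejected).
    if (name.empty() || !(is_letter(name[0]) || name[0] == '_')) return false;
    // May contain only letters, digits and underscores.
    for (char c : name) {
        if (!(is_letter(c) || is_digit(c) || c == '_')) return false;
    }
    // Must not be a keyword.
    return !is_sample_keyword(name);
}
```

Under these assumptions, is_valid_identifier("add_numbers") returns true, while "Name@", "6age" and "new" are rejected. Because the comparison is case sensitive, "New" would be accepted even though "new" is not.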

3. Constants in C++

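Constants are values that remain fixed. Values written directly in the source code, such as 10, 3.14, 'A' or true, are also referred to as literals. We can also give a constant a name using the const keyword; once it is defined, we cannot change its value.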

Constants can be of integer, floating-point, character, string or boolean data types.

For example,
const float pi = 3.14f;

Here, the value of the constant pi cannot be changed anywhere in the program.
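To make the different kinds of constants concrete, here is a minimal sketch with one named constant per data type. All names apart from pi are illustrative, and circle_area is a hypothetical helper added only to show a constant being read.

```cpp
// Named constants initialised from literals of different types.
// All names other than pi are illustrative.
const int   days_in_week = 7;             // integer literal
const float pi           = 3.14f;         // floating-point literal
const char  grade        = 'A';           // character literal
const bool  is_ready     = true;          // boolean literal
const char* const site   = "TechVidvan";  // string literal

// Constants can be read freely in expressions.
float circle_area(float radius) {
    return pi * radius * radius;
}

// Uncommenting the next line should make the compiler reject the program,
// because pi is declared const:
// void change_pi() { pi = 3.15f; }
```

Reading a constant is always allowed; only an attempt to assign to it is rejected at compile time.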

4. Strings in C++

A string stores a sequence of characters. A C-style string terminates with a null character '\0'. Unlike character literals, which use single quotes, string literals in C++ are always enclosed within double quotes (" ").

In C++, there are two types of strings:

a. C-style strings
Example – char name[] = "TechVidvan";

b. Objects of the string class in the Standard C++ Library (declared in the <string> header)
Example – std::string name = "TechVidvan";
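The difference between the two kinds of strings can be seen in a short sketch. The function names below are hypothetical and exist only for illustration. The first two functions show the effect of the terminating null character, and the third shows that a string object can be combined with other text directly.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Bytes occupied by the C-style array, including the terminating '\0'.
std::size_t c_style_storage() {
    char name[] = "TechVidvan";
    return sizeof(name);
}

// Visible characters in the C-style string; strlen stops at '\0'.
std::size_t c_style_length() {
    char name[] = "TechVidvan";
    return std::strlen(name);
}

// A std::string keeps track of its own length and supports operator+.
std::string greeting() {
    std::string name = "TechVidvan";
    return "Hello, " + name;
}
```

The array holds one more element than the number of visible characters, because the compiler appends '\0' to the string literal.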

5. Special Symbols in C++

Some symbols in C++ have a special meaning to the compiler, and we cannot alter that meaning. The special symbols in C++ are:

Symbol   Name                  Use
[ ]      Square brackets       Used for single-dimensional and multidimensional array subscripts.
( )      Parentheses           Used for function calls and parameter lists.
{ }      Curly braces          Used to mark the beginning and end of a code block.
,        Comma                 Used to separate items in a list, such as the parameters of a function.
:        Colon                 Used to introduce a constructor's initialization list.
;        Semicolon             Also called the statement terminator, it marks the end of a statement.
*        Asterisk              Used to declare pointers.
#        Hash / Preprocessor   Used to begin a preprocessor directive, for example to include header files or define constants.
.        Dot                   Used to access a member of a structure.
~        Tilde                 Used to prefix the name of a destructor.
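The following sketch shows most of these symbols in context. The Counter type and the sum_pair function are hypothetical and were invented only to give each symbol a place to appear.

```cpp
#include <cstddef>                         // '#' begins a preprocessor directive

// Hypothetical type used only to show the symbols in context.
struct Counter {                           // '{' opens a block
    int value;                             // ';' ends a statement
    Counter(int start) : value(start) {}   // ':' introduces the initialization list
    ~Counter() {}                          // '~' prefixes the destructor name
};                                         // '}' closes the block

// In the signature below, '*' declares a pointer, '( )' enclose the
// parameter list and ',' separates the parameters.
int sum_pair(const int* data, std::size_t first, std::size_t second) {
    Counter total(data[first]);            // '[ ]' subscript an array element
    total.value += data[second];           // '.' accesses a member
    return total.value;
}
```

For an array holding 3, 4 and 5, calling sum_pair with the indexes 0 and 2 adds the first and last elements.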

6. Operators in C++

Operators are symbols that operate on operands. These operands can be variables or values. Operators help us to perform mathematical and logical computations.

Operators in C++ are classified into the following types based on the number of operands they operate on:

a. Unary Operators: These act upon one operand. For example, increment operator (++).
b. Binary Operators: They operate on two operands. For example, addition operator (+).
c. Ternary Operator: There is a ternary operator in C++ that acts on three operands. It is the ?: conditional operator.
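A minimal sketch of the three kinds, with one small function per kind. The function names are illustrative and do not come from any library.

```cpp
// Illustrative functions; the names are made up for this sketch.
int next_value(int n) {
    return ++n;              // unary: ++ acts on the single operand n
}

int add(int a, int b) {
    return a + b;            // binary: + acts on the two operands a and b
}

int larger(int a, int b) {
    return (a > b) ? a : b;  // ternary: ?: acts on a condition and two values
}
```

In larger, the condition (a > b) is evaluated first, and only one of the two remaining operands is chosen as the result.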

On the basis of the nature of the operation, operators in C++ are classified into the following six types:

  • Arithmetic
  • Assignment
  • Relational
  • Logical
  • Bitwise
  • Other Operators
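One example of each category is sketched below. The function names are hypothetical, and sizeof is used here as a representative of the "other" category, which is an assumption, since the list above does not say which operators belong there.

```cpp
#include <cstddef>

// Arithmetic: % gives the remainder of an integer division.
int remainder_of(int a, int b) { return a % b; }

// Assignment: += adds to a variable and stores the result in it.
int add_to_total(int total, int amount) { total += amount; return total; }

// Relational: >= compares two values and yields true or false.
bool is_adult(int age) { return age >= 18; }

// Logical: && is true only when both comparisons are true.
bool in_range(int x, int low, int high) { return x >= low && x <= high; }

// Bitwise: & keeps only the bits set in both operands.
unsigned low_four_bits(unsigned x) { return x & 0xFu; }

// Other (assumed example): sizeof yields the size of a type in bytes.
std::size_t int_size() { return sizeof(int); }
```

For instance, remainder_of(17, 5) gives 2 and low_four_bits(0xAB) gives 0xB, while the result of int_size() depends on the platform.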

Summary

Now we understand what tokens are in C++. A token is the smallest unit of a program that the compiler understands. Tokens are classified into keywords, identifiers, constants, strings, special symbols and operators.