Python data types

Warning
This article was last updated on 11.05.2022; the content may be out of date.

Python is a strongly and dynamically typed language: types belong to objects and are determined at runtime rather than declared for variables (More about type systems).

Built-in and modules
Python provides some built-in data types and a variety of specialized data types from modules.
Integer is more than just an Integer
The standard Python implementation is written in C. This means that every Python object is simply a cleverly-disguised C structure, which contains not only its value, but other information as well.
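
The overhead of that C structure can be observed from Python itself. The snippet below is a minimal sketch that assumes CPython; the exact byte counts depend on the interpreter version and platform, so none are hard-coded here.

```python
import sys

# An int object stores bookkeeping (such as a reference count and a
# pointer to its type) in addition to the digits of the number itself.
small = 1
big = 2 ** 100

small_size = sys.getsizeof(small)  # size in bytes, platform-dependent
big_size = sys.getsizeof(big)      # larger, because more digits are stored

print(small_size, big_size)
```

On a typical 64-bit build the first number is already several times larger than the 4 or 8 bytes a plain C integer would need.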

Every value in Python has a data type. Since everything in Python is an object, data types are actually classes, and values are instances (objects) of these classes.

At a hardware level, a variable is a reference to a location in memory.
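
A short illustration of both points: types are classes, and a variable is a name bound to an object rather than a box holding a copy. The variable names are chosen only for this example.

```python
a = 1000

# The type of a value is a class, and the value is an instance of it.
print(type(a) is int)         # True
print(isinstance(a, object))  # True

# Assignment binds a second name to the same object; nothing is copied.
first = [1, 2, 3]
second = first
second.append(4)

print(first)            # [1, 2, 3, 4]
print(first is second)  # True
```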

There are three distinct numeric types: int, float, and complex.

int - or integer, is a whole number, positive or negative, without decimals, of unlimited length (up to the maximum available memory of the system).

# integer
a = 1000
type(a)  # <class 'int'>
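
To see what "unlimited length" means in practice, here is a small sketch: a value far beyond 64 bits is still an ordinary int.

```python
big = 2 ** 100

print(big)               # 1267650600228229401496703205376
print(type(big))         # <class 'int'>
print(big.bit_length())  # 101
```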

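float - or floating point number, is a number, positive or negative, containing a decimal point or an exponent sign “e” (indicating a power of 10: 35e3, 12E4). It is normally stored as a 64-bit double, so it is accurate to roughly 15-17 significant decimal digits (not 15 digits after the decimal point).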

float includes nan and inf/-inf.

# float
b = 12.5634423  # or 35e3, 12E4, -87.7e100
type(b)  # <class 'float'>

# Not A Number
x = float("nan")
print(x)  # nan
type(x)  # <class 'float'>

# Similarly with infinity
y = float("inf")  # inf
z = float("-inf")  # -inf
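
The limited precision and the special values have a few practical consequences. The following sketch uses only the standard math module.

```python
import math

# Binary floating point cannot represent 0.1 or 0.2 exactly.
total = 0.1 + 0.2
print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True

# nan is not equal to anything, including itself.
nan = float("nan")
print(nan == nan)       # False
print(math.isnan(nan))  # True

# inf compares greater than any finite number.
inf = float("inf")
print(inf > 10 ** 308)   # True
print(math.isinf(-inf))  # True
```

For comparisons, math.isclose() and math.isnan() are usually safer than ==.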

complex - or complex numbers (x+yj, where x - real part, y - imaginary part and j - imaginary unit).

# complex
c = 3+5j  # 5j, -5j
type(c)  # <class 'complex'>
print(c.real)  # 3.0
print(c.imag)  # 5.0
long - in Python 2.x, a separate type used to hold longer integers. In Python 3.x it no longer exists: int handles integers of any size.

In addition, Booleans are a subtype of integers.

Booleans represent one of two values: True or False (constant objects, case sensitive).

type(True)  # <class 'bool'>
type(False)  # <class 'bool'>
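
Because bool is a subtype of int, True and False behave like 1 and 0 in arithmetic. A brief sketch (the flags list is made up for illustration):

```python
print(issubclass(bool, int))  # True
print(True == 1, False == 0)  # True True
print(True + True)            # 2

# Handy for counting how many conditions hold.
flags = [True, False, True, True]
count = sum(flags)
print(count)                  # 3
```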

The bool() function evaluates any value and returns True or False.

True and false values are not limited to the constants True and False.

A value is considered true if it is:

  • any non-zero number
  • any non-empty string
  • any non-empty object

A value is considered false if it is:

  • 0
  • None
  • empty string
  • empty object
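
The rules above can be checked directly with bool(). The sample values are arbitrary.

```python
values = [1, -0.5, "text", [0], 0, 0.0, None, "", [], {}]
truthiness = [bool(v) for v in values]

for value, result in zip(values, truthiness):
    print(repr(value), "->", result)
```

Note that [0] is true: the list is non-empty, even though its only item is false.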

There are three basic sequence types: list, tuple, and range. In addition, there is the text sequence type str and the binary sequence types bytes, bytearray, and memoryview.

Like all data types, sequences can be mutable or immutable.

These operations are supported by most sequence types, both mutable and immutable:

x in s  # (x not in s) - containment testing
s + t  # the concatenation of s and t
s * n  # equivalent to adding s to itself n times
s[i]  # get by index
s[i:j:k]  # slices (start:stop:step)
len(s)  # length of s
min(s)  # smallest item of s
max(s)  # largest item of s
s.index(x[, i[, j]])  # index of the first occurrence of x in s (at or after index i and before index j)
s.count(x)  # total number of occurrences of x in s
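
As a concrete run of the common operations, here they are applied to tuples (any sequence type that supports them would do; s and t are sample data):

```python
s = (10, 20, 30, 20)
t = (40,)

print(20 in s)                      # True
print(s + t)                        # (10, 20, 30, 20, 40)
print(t * 3)                        # (40, 40, 40)
print(s[0], s[-1])                  # 10 20
print(s[1:3])                       # (20, 30)
print(len(s), min(s), max(s))       # 4 10 30
print(s.index(20), s.index(20, 2))  # 1 3
print(s.count(20))                  # 2
```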

hash() - the only operation that immutable sequence types generally implement that is not also implemented by mutable sequence types. This support allows immutable sequences, such as tuple instances, to be used as dict keys and stored in set and frozenset instances.
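
A minimal sketch of the difference: a tuple works as a dict key, while a list is rejected as unhashable. The names point and locations are illustrative only.

```python
point = (3, 4)
locations = {point: "treasure"}
print(locations[(3, 4)])  # treasure

error = None
try:
    hash([3, 4])  # lists are mutable, so they define no hash
except TypeError as exc:
    error = str(exc)
    print(error)
```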

Operations supported by mutable sequence types:

s[i] = x
s[i:j] = t
del s[i:j]
s[i:j:k] = t
del s[i:j:k]
s.append(x)
s.clear()
s.copy()
s.extend(t)  # or s += t
s *= n
s.insert(i, x)
s.pop()  # or s.pop(i)
s.remove(x)
s.reverse()
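
To make these operations concrete, here is a walk-through on a list, with the state of s after each step shown in the comments:

```python
s = [1, 2, 3, 4, 5]

s[0] = 10              # [10, 2, 3, 4, 5]
s[1:3] = [20, 30, 35]  # [10, 20, 30, 35, 4, 5]  (a slice may change the length)
del s[4:6]             # [10, 20, 30, 35]
s.append(6)            # [10, 20, 30, 35, 6]
s.insert(1, 15)        # [10, 15, 20, 30, 35, 6]
last = s.pop()         # returns 6; s is [10, 15, 20, 30, 35]
s.remove(20)           # [10, 15, 30, 35]  (removes the first matching item)
s.reverse()            # [35, 30, 15, 10]

print(s, last)
```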

Strings are immutable ordered sequences of Unicode code points. In Python there is no separate character data type (char); a character is simply a string of length one. String literals are written in a variety of ways:

# Single quotes
first_way = 'This is a string'

# Double quotes
second_way = "This is a string"

# Triple quotes (single or double)
third_way_single = '''This is a string'''
third_way_double = """This is a string"""

Strings implement all of the common sequence operations, along with the additional methods of the str class.
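
A short sketch of these properties: indexing yields a string of length one, sequence operations work as expected, and in-place modification is rejected.

```python
word = "Python"

first_char = word[0]
print(type(first_char), len(first_char))  # <class 'str'> 1
print(word[::-1])                         # nohtyP
print("th" in word)                       # True

# Strings are immutable: item assignment raises TypeError.
try:
    word[0] = "J"
except TypeError:
    print("strings cannot be modified in place")

# Build a new string instead.
changed = "J" + word[1:]
print(changed)                            # Jython
```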

Lists are mutable ordered sequences, typically used to store collections of homogeneous items. Since lists are indexed, lists can have items with the same value.

list1 = [10, 20, 30, 77, 77]
list2 = ['one', 'dog', 'seven']
list3 = [1, 20, 4.0, 'word']

Tuples are immutable ordered sequences, typically used to store collections of heterogeneous data. Tuples are also used for cases where an immutable sequence of homogeneous data is needed (such as allowing storage in a set or dict instance).

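Tuples are generally a little cheaper than lists: they take less memory and are usually faster to create, although the difference rarely matters outside tight loops.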

tuple1 = (1, 2, 5, 6)
tuple2 = tuple(['a', 'b', 'c'])  # tuple() takes a single iterable argument
tuple3 = ()  # empty tuple
tuple4 = 23, 13, 100
tuple5 = ("London", "Tokyo", "Korea", 1986, 1640, 1948)
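
One rough way to compare tuples with lists is to look at object sizes and at what happens on assignment. This sketch assumes CPython; sys.getsizeof() results vary between versions, so only the relative order is of interest.

```python
import sys

as_tuple = (1, 2, 3, 4, 5)
as_list = [1, 2, 3, 4, 5]

# The tuple container is typically smaller than the equivalent list.
print(sys.getsizeof(as_tuple), sys.getsizeof(as_list))

# Tuples are immutable: item assignment raises TypeError.
modified = True
try:
    as_tuple[0] = 99
except TypeError:
    modified = False
print(modified)  # False
```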

The range type represents an immutable ordered sequence of numbers and is commonly used for looping a specific number of times in for loops.

Ranges implement all of the common sequence operations except concatenation and repetition.

Testing range objects for equality with == and != compares them as sequences.

Python 3.3: range objects gained the start, stop and step attributes.
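
A brief sketch of range behaviour, assuming Python 3.3 or later:

```python
r = range(0, 10, 2)

print(list(r))                  # [0, 2, 4, 6, 8]
print(r.start, r.stop, r.step)  # 0 10 2
print(6 in r, 7 in r)           # True False
print(len(r), r[-1])            # 5 8
print(r[1:3])                   # range(2, 6, 2)

# Equality compares the sequences produced, not the arguments.
same = range(0, 3, 2) == range(0, 4, 2)
print(same)                     # True (both yield 0, 2)
```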

Bytes objects are immutable ordered sequences of single bytes. The syntax for bytes literals is largely the same as that for string literals, except that a b prefix is added.

Bytearray objects are a mutable counterpart to bytes objects.

Memoryview objects allow Python code to access the internal data of an object that supports the buffer protocol without copying.
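
The three binary types can be seen side by side in a minimal sketch. The sample data is arbitrary.

```python
data = b"hello"
print(data[0])      # 104 (indexing bytes gives an int)

# bytearray is the mutable counterpart.
buf = bytearray(data)
buf[0] = ord("j")
print(bytes(buf))   # b'jello'

# A memoryview shares the underlying buffer, so writing through the
# view changes buf without copying it.
view = memoryview(buf)
view[0] = ord("c")
print(bytes(buf))   # b'cello'
```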

A set object is an unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.

Being an unordered collection, sets do not record element position or order of insertion. Accordingly, sets do not support indexing, slicing, or other sequence-like behavior.

There are currently two built-in set types, set and frozenset.

The set type is mutable — the contents can be changed using methods like add() and remove(). Since it is mutable, it has no hash value and cannot be used as either a dictionary key or as an element of another set.

The frozenset type is immutable and hashable — its contents cannot be altered after it is created. It can therefore be used as a dictionary key or as an element of another set.
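
The common set operations, and the set versus frozenset distinction, can be sketched like this (all values are sample data):

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

common = a & b       # intersection: {3, 4}
either = a | b       # union: {1, 2, 3, 4, 5}
only_a = a - b       # difference: {1, 2}
exclusive = a ^ b    # symmetric difference: {1, 2, 5}

unique = set([1, 1, 2, 3, 3])  # duplicates removed: {1, 2, 3}

# A frozenset is hashable, so it can be a dictionary key.
editor = frozenset({"read", "write"})
roles = {editor: "editor"}
print(roles[frozenset({"write", "read"})])  # editor

# A plain set is not hashable.
try:
    {a: "fails"}
    set_is_hashable = True
except TypeError:
    set_is_hashable = False
```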

A dictionary is an ordered, mutable collection of key: value pairs that does not allow duplicate keys.

  • Dictionary keys are arbitrary, immutable (hashable) values.
  • The values in dictionary items can be of any data type.

Python 3.7: dictionaries are guaranteed to preserve insertion order.

Python 3.8: dictionaries and dictionary view objects (dict.keys(), dict.values(), dict.items()) are now reversible.

Dictionaries can be created by several means:

# Use a comma-separated list of key: value pairs within braces
dict1 = {'Moscow': 1023, 'SPB': 2048}

# Use a dict comprehension
dict2 = {x: x ** 2 for x in range(10)}

# Use the type constructor
dict3 = dict([('foo', 100), ('bar', 200)])
dict4 = dict(foo=100, bar=200)
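
A short sketch of ordering, key uniqueness, and hashable keys. It assumes Python 3.8 or later because of reversed(); the extra city is made up for illustration.

```python
cities = {"Moscow": 1023, "SPB": 2048}
cities["Kazan"] = 4096  # new keys go to the end
cities["Moscow"] = 1    # an existing key is updated in place, not duplicated

print(list(cities))            # ['Moscow', 'SPB', 'Kazan']
print(list(reversed(cities)))  # ['Kazan', 'SPB', 'Moscow']
print(cities["Moscow"])        # 1

# Keys must be hashable: a list is rejected, a tuple is fine.
try:
    bad = {[1, 2]: "x"}
    list_key_ok = True
except TypeError:
    list_key_ok = False

grid = {(0, 0): "origin"}
print(list_key_ok, grid[(0, 0)])  # False origin
```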

In other programming languages, a dictionary-like data type might be called an associative array, hash, or hash table.

Sources:

Python documentation - Built-in Types

Understand How Much Memory Your Python Objects Use

Understanding Data Types in Python