Containers in Python
Course Lessons
S.No | Lesson Title |
---|---|
1 | Introduction |
2 | Types of Containers |
2.1 | Lists |
2.2 | Tuples |
2.3 | Dictionary |
2.4 | Sets |
3 | When to Use What |
4 | Conclusion |
Introduction
Programming languages have been around for decades, and every few years a new one becomes widely popular. As a data scientist, one of the first things you need to do is master a programming language. Right now, Python is among the most widely used languages in data science applications. In this article, we'll look at one reason it is so convenient for data work: its built-in container classes. Let's get started.
Types of Containers
Python comes with several pre-defined container classes that are used all the time. In the following sections, you'll see the different types of containers and how easy it is to manipulate them.
Lists
A list represents an ordered, mutable collection of objects. You can mix objects of any type in a list, and you can add and remove objects from it. To create an empty list, either call the list() function or use square brackets:
l=[]
l=list()
We can initialize a list with content of any type using the same square bracket notation. The list() function also takes an iterable as a single argument and returns a new list containing its items; when the argument is itself a list, the result is a shallow copy. A list is one such iterable, as we'll see now.
l=['a','b','c']
l2 = list(l)
print(l2)
Output:
['a', 'b', 'c']
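To make the "shallow copy" point concrete, here is a minimal sketch. The names original, duplicate, and nested are made up for illustration. The new list is a separate object, so appending to it leaves the original alone, but the objects inside are shared rather than duplicated.

```python
# Illustrative names only; none of these come from the article.
nested = ['x']
original = ['a', 'b', nested]
duplicate = list(original)

duplicate.append('c')   # changes only the new list
nested.append('y')      # the inner list is shared by both lists

print(original)               # ['a', 'b', ['x', 'y']]
print(duplicate)              # ['a', 'b', ['x', 'y'], 'c']
print(duplicate is original)  # False
```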
A Python string is a sequence of characters and can be treated as an iterable over those characters. Passing a string to the list() function generates a new list of its characters.
print(list('name'))
Output:
['n', 'a', 'm', 'e']
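Objects are added with append(), which puts an object at the end of the list, and insert(), which puts it at a given index.
l = []
l.append('b')
l.append('c')
l.insert(0, 'a')  # append() takes one argument; insert() takes an index and an object
print(l)
Output:
['a', 'b', 'c']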
It is easy to iterate over lists using the 'for' statement. Each element in the iterable structure is assigned once to the "loop variable" for a single pass of the loop during which the block enclosed is executed.
for i in l:
    print(i)
Iterating with a while loop is also possible. Generally, a while loop is used for iterations of unknown length, checking a condition on each pass or using a break statement when a condition is met. Here, list.pop() removes and returns the last item on each pass, so no item is visited twice and the loop ends once the list is empty. Note that the items come out in reverse order.
l = ['a', 'b', 'c']
while len(l):
    print(l.pop())
Output:
c
b
a
The next important functionality is accessing list elements and slicing. Elements are accessed by index, starting from 0, much like array elements in other languages. A useful addition is negative indexing, where l[-1] gives us the last element of list l.
Slices are another important extension of the subscripting syntax. They are marked by one or two colons (:) inside the square bracket subscript. In the single colon format, the number before the colon is the starting index (inclusive), and the number after the colon is the ending index (exclusive). In l[:2] the first index is omitted, which means the slice starts from the very first element; in l[2:] the slice starts at index 2 (the third element) and ends at the last element.
In the double colon format, the first two numbers are the start and end indices, while the number after the second colon is the stride. For example, l[::2] takes every second item from the list, starting with the first.
l = list('abcde')
print(l)
print(l[3])
print(l[-3])
print(l[1:3])
print(l[1:-1])
Output:
['a', 'b', 'c', 'd', 'e']
d
c
['b', 'c']
['b', 'c', 'd']
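The omitted-index and stride forms described earlier can be sketched the same way. This is only an illustration; the variable names are invented for the example.

```python
letters = list('abcde')

first_two = letters[:2]       # start omitted: begins at index 0
from_third = letters[2:]      # end omitted: runs to the last element
every_second = letters[::2]   # stride of 2, starting at index 0
odd_positions = letters[1::2] # stride of 2, starting at index 1
backwards = letters[::-1]     # a negative stride walks the list in reverse

print(first_two)      # ['a', 'b']
print(from_third)     # ['c', 'd', 'e']
print(every_second)   # ['a', 'c', 'e']
print(odd_positions)  # ['b', 'd']
print(backwards)      # ['e', 'd', 'c', 'b', 'a']
```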
To check whether an object is inside a list, we can use the 'in' keyword. The index() method returns the position of the first occurrence of an object in the list.
l = list('abc')
print('a' in l)
print(l.index('c'))
Output:
True
2
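One thing worth knowing is that index() raises a ValueError when the object is not in the list. A small sketch of two ways to handle that, using invented names (target, position, repeated):

```python
l = list('abc')

# Option 1: check membership first.
target = 'z'
if target in l:
    position = l.index(target)
else:
    position = None
print(position)  # None

# Option 2: catch the exception.
try:
    l.index('z')
except ValueError:
    print("'z' is not in the list")

# index() reports only the first occurrence of a repeated object.
repeated = list('abca')
print(repeated.index('a'))  # 0
```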
Tuples
Tuples are similar to lists, with the main difference being that they are immutable. They are slightly faster and smaller than lists. They are created in a similar way, using parentheses instead of square brackets, and can also be initialized with values. Another way of creating a tuple is by passing an iterable object to the tuple() function.
t=()
t=tuple()
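As a short sketch of the other ways to create a tuple (the variable names here are only illustrative):

```python
t = ('a', 'b', 'c')            # initialized with values
from_list = tuple(['a', 'b'])  # from a list
from_string = tuple('abc')     # from a string

# A one-element tuple needs a trailing comma;
# parentheses alone do not make a tuple.
single = ('a',)
not_a_tuple = ('a')

print(from_list)          # ('a', 'b')
print(from_string)        # ('a', 'b', 'c')
print(type(single))       # <class 'tuple'>
print(type(not_a_tuple))  # <class 'str'>
```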
A very common way of accessing the elements of a tuple in Python is called unpacking. It works on lists as well but is more commonly used with tuples, since it requires us to know the size of the container. Unpacking is done by assigning the tuple to a comma-separated series of variables whose count equals the number of elements in the tuple. Each variable is assigned the value at the corresponding position.
t = ('a', 'b')
a, b = t
print(a)
print(b)
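If the number of variables does not match the number of elements, unpacking fails with a ValueError. The sketch below shows that, along with Python 3's starred form, which collects the leftover elements into a list. The names first and rest are made up for the example.

```python
t = ('a', 'b', 'c')

try:
    a, b = t  # three values, but only two variables
except ValueError:
    print('the number of variables must match the number of elements')

# A starred name collects whatever is left over into a list.
first, *rest = t
print(first)  # a
print(rest)   # ['b', 'c']
```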
Tuples can be sliced and accessed in the same way as lists. The slices you'll get will be tuples themselves. Similarly, you can iterate over the tuples in the same way you iterate over a list.
The major difference between a tuple and a list is that tuples are immutable. This means that the objects in a tuple cannot be replaced or updated with something else. If you try to assign to an element, you'll get a TypeError. This makes tuples a safe place to store data that should not be modified.
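A minimal sketch of what immutability looks like in practice. The flag assignment_failed and the name t2 exist only for this illustration.

```python
t = ('a', 'b')

assignment_failed = False
try:
    t[0] = 'z'  # tuples do not support item assignment
except TypeError:
    assignment_failed = True
print(assignment_failed)  # True

# To "change" a tuple, build a new one from pieces of the old one.
t2 = ('z',) + t[1:]
print(t2)  # ('z', 'b')
print(t)   # ('a', 'b')
```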
Dictionary
Dictionaries in Python implement a key-value mapping, which goes by the name "hash table" or "associative array" in other languages. They are one of the most important built-in container classes in Python. Dictionaries are created much like tuples or lists, but using braces '{}' around key: value pairs. You can also use the dict() built-in function, which accepts a set of keyword arguments.
char = {1:'a', 2:'b', 3:'c'}
char = {}
char[1] = 'a'
char[2]= 'b'
char[3]= 'c'
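The dict() route can be sketched as follows. The names and scores are invented for the example. Keyword arguments always produce string keys, so for keys such as 1, 2, 3, dict() can instead be given an iterable of (key, value) pairs.

```python
# Keyword names become string keys.
scores = dict(alice=90, bob=85)
print(scores)  # {'alice': 90, 'bob': 85}

# Non-string keys cannot be written as keywords,
# so pass (key, value) pairs instead.
char = dict([(1, 'a'), (2, 'b'), (3, 'c')])
print(char)    # {1: 'a', 2: 'b', 3: 'c'}
```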
Dictionary values can be iterated over by using the keys, as follows. Iterating over a dictionary directly yields its keys; you can also call the keys() method explicitly.
for key in char:
    print(char[key])
for key in char.keys():
    print(char[key])
New values can be entered as key-value pairs and old values can be modified using the keys.
char[4] = 'd'  # adding a new value
char[2] = 'B'  # changing an old value
If we look up a key that is not in the dictionary, a KeyError is raised. Another important point is that dictionary keys must be hashable, which in practice means immutable: you cannot use a list as a key, but you can use a tuple (as long as its elements are hashable too). Dictionaries cannot be used as keys because they are mutable.
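Both points can be seen in a short sketch. The get() method used here is a standard dictionary method that returns a default instead of raising; the dictionary grid and the flag names are made up for the example.

```python
char = {1: 'a', 2: 'b'}

# A missing key raises KeyError.
missing_key_raised = False
try:
    char[5]
except KeyError:
    missing_key_raised = True
print(missing_key_raised)  # True

# get() returns None (or a chosen default) instead of raising.
print(char.get(5))       # None
print(char.get(5, '?'))  # ?

# Tuples work as keys; lists do not.
grid = {}
grid[(0, 1)] = 'tuple key'
list_key_rejected = False
try:
    grid[[0, 1]] = 'list key'
except TypeError:
    list_key_rejected = True
print(list_key_rejected)  # True
```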
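Another important method used with dictionaries is items(). It returns the dictionary's (key, value) pairs as tuples; in Python 3 the result is a view object rather than a list, and it can be looped over directly or passed to list().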
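A small sketch of items() in use, assuming Python 3. The name pairs is invented for the example.

```python
char = {1: 'a', 2: 'b', 3: 'c'}

# Each pair unpacks neatly into two loop variables.
for key, value in char.items():
    print(key, value)

# Wrap the result in list() when an actual list is needed.
pairs = list(char.items())
print(pairs)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```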
Sets
A set is an unordered, mutable collection of unique objects. It is designed to reflect the properties of a mathematical set. A set is created by calling the set() function on an iterable object.
l = ['a', 'b', 'a', 'c']
s = set(l)
print(s)
Output (element order may vary):
{'a', 'b', 'c'}
Notice how only one 'a' is retained after we convert the list into a set. Sets also have a pop() method, but because a set is unordered, it removes and returns an arbitrary element rather than the last one.
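Here is a brief sketch of a set being modified. It uses add(), a standard set method, alongside pop(). Since pop() on a set returns an arbitrary element, the example does not assume which one comes out.

```python
s = set(['a', 'b', 'a', 'c'])

s.add('d')  # a new element is added
s.add('a')  # adding an existing element has no effect
print(len(s))  # 4

removed = s.pop()    # removes and returns an arbitrary element
print(removed in s)  # False
print(len(s))        # 3
```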
Sets also support the standard set operations, such as union and intersection. These are available as methods and can be called by name, as shown in the following snippet.
S1 = set(['a', 'b', 'c', 'b'])
S2 = set(['d', 'e', 'f', 'e', 'a'])
print(S1)
print(S2)
print(S1.union(S2))
print(S1.intersection(S2))
Output (element order may vary):
{'a', 'b', 'c'}
{'a', 'd', 'e', 'f'}
{'a', 'b', 'c', 'd', 'e', 'f'}
{'a'}
When to Use What
The biggest dilemma when working with container classes is picking the right one for the data at hand. Here are a few points that can help when selecting one of the classes:
- When you don't want the data to be altered by anyone, use tuples as they are immutable.
- If the data is in the form of key-value pairs, then the obvious approach is to store it in a dictionary. A common mistake is to keep separate lists or tuples for the keys and their corresponding values instead.
- If you want to perform set operations such as union and intersection on your data in the future, then use sets as your container.
- If you want to perform all sorts of slicing, replacements, and related operations on your data then go with lists.
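The key-value point above can be sketched with made-up data. With two parallel lists, every lookup has to go through a position; with a dictionary, the key leads straight to the value. The names and ages are invented for the example.

```python
# Parallel lists: the two must be kept in the same order by hand.
names = ['alice', 'bob', 'carol']
ages = [31, 25, 40]
age_from_lists = ages[names.index('bob')]

# A dictionary keeps each key next to its value.
ages_by_name = dict(zip(names, ages))
age_from_dict = ages_by_name['bob']

print(age_from_lists, age_from_dict)  # 25 25
```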
Conclusion
Python's built-in container classes each offer different functionality, which makes it easier to work with different kinds of data. There are more containers that are less commonly used, and they can be explored after going through the basic ones discussed in this article. Happy learning!