
A collection is not a strange concept. It is just a bunch of objects put together for a certain reason. You may have a collection of pictures, songs, books, movies, stamps… almost anything. Collections in programming, and in Python in particular, follow the same idea: they are objects grouped together so that they are easy to access and use later. There are many types of collections in Python. In this post, I will briefly describe three of them, namely lists, tuples, and dictionaries. You can download the complete notebook here.

Lists in Python

Lists are ordered collections. This means the items in a list are kept in order, and we can access them using their indexes. First, let us talk about how to create a list. If you put items together, separated by commas (,) and wrapped inside a pair of square brackets ([]), you have just got yourself a list. More formally, the syntax is as follows:

In [ ]:

list_name = [<val_1>, <val_2>, ...]

with list_name being the variable that will store the list, and val_1, val_2, … being the items you want to put into the list. In Python, these values can be pretty much anything. Furthermore, you can have as many items in a list as you want (as long as the computer has enough memory). A list can also contain variables, as long as you have defined them beforehand. If you access the variable that refers to a list, you will get all of its items.

In [2]:

#a list of numbers
a_list = [1,2,3,4,5]

#we can use print() to print out all items in a list
print(a_list)

[1, 2, 3, 4, 5]

In [3]:

#a list of strings
another_list = ['a','b','c','d','e']
print(another_list)

['a', 'b', 'c', 'd', 'e']

In [6]:

#a list of variables, remember, you need to create them first 
x = 10
y = 20
z = 30

list_4 = [x,y,z]

print(list_4)

[10, 20, 30]
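
To illustrate the point that a list can hold pretty much anything, here is a small sketch that mixes several types in one list. The names price and mixed_list and all the values are made up for this example; they are not part of the notebook.

```python
# Hypothetical values, chosen only for illustration.
price = 9.99
mixed_list = [1, "two", 3.0, True, price, [10, 20]]

# len() reports how many items the list holds.
print(len(mixed_list))

# Each item keeps its own type, even though they share one list.
for item in mixed_list:
    print(type(item).__name__, item)
```

Note that the last item is itself a list, so lists can be nested inside other lists.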

Indexing in lists

Lists index their items with integers starting from 0. You can access an item using its index value. The syntax is list_name[<item_index>]. For example:

In [2]:

a_list = [10,4,6,6,12,61,78,34,90,73]

print(a_list[0])
print(a_list[1])
print(a_list[5])
print(a_list[9])

In the code above, a_list[0] takes the first item in a_list, which is 10, so the first print() displays 10. Similarly, a_list[1] gives the second item, 4, while a_list[5] and a_list[9] give the sixth and tenth (also last) items, which are 61 and 73. So, the indexes of items in a list start at 0 and end at the list size minus 1. Additionally, Python has a negative index system to access a list from the end: it starts at -1 for the last item and ends at minus the list size for the first item. The cell below showcases some negative indexes.

In [25]:

print(a_list[-1])
print(a_list[-2])
print(a_list[-5])

73
90
61
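
One way to see how the two index systems relate is to compare them side by side. The sketch below assumes the same a_list as above; the names size and message are introduced only for this example. It also shows what happens when an index falls outside the valid range.

```python
a_list = [10, 4, 6, 6, 12, 61, 78, 34, 90, 73]
size = len(a_list)

# A negative index -k refers to the same item as the index size - k.
for k in (1, 2, 5):
    print(a_list[-k], a_list[size - k])

# Valid indexes run from -size to size - 1; anything else raises IndexError.
try:
    a_list[size]
except IndexError as error:
    message = str(error)
    print("IndexError:", message)
```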

Slicing lists

Indexes let us access individual items in a list. For multiple items, we use slicing. The syntax of slicing is list_name[start:stop:step]. Here, start:stop:step generates a sequence of indexes much like range() does:
– Consists of integers
– Begins at start
– Stops just before reaching stop (the item at index stop itself is not included)
– Changes by step each time.
You can omit start or stop (or both), which means slicing from the beginning or until the end of the list. Omitting step means the increment is 1. The few cells below demonstrate slicing a_list, which we created previously.

In [26]:

a_list[0:5]

Out[26]:

[10, 4, 6, 6, 12]

In [27]:

a_list[3:9]

Out[27]:

[6, 12, 61, 78, 34, 90]

In [28]:

a_list[2:-1]

Out[28]:

[6, 6, 12, 61, 78, 34, 90]

In [40]:

a_list[1:-2:2]

Out[40]:

[4, 6, 61, 34]

In [42]:

a_list[8:3:-2]

Out[42]:

[90, 78, 12]

In [41]:

a_list[::-1]

Out[41]:

[73, 90, 34, 78, 61, 12, 6, 6, 4, 10]

As you can see, step can be negative, in which case we slice the list from the end back toward the beginning.
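
To make the comparison with range() concrete, here is a rough sketch that rebuilds a slice one index at a time. The function slice_with_range is a hypothetical helper written for this post, not something built into Python; it relies on the indices() method of the built-in slice object to resolve omitted or negative values.

```python
a_list = [10, 4, 6, 6, 12, 61, 78, 34, 90, 73]

def slice_with_range(items, start, stop, step):
    """Rebuild items[start:stop:step] one index at a time (hypothetical helper)."""
    # slice.indices() turns omitted (None) and negative values into
    # concrete range() arguments for a sequence of the given length.
    indexes = range(*slice(start, stop, step).indices(len(items)))
    return [items[i] for i in indexes]

# Each call should give the same result as the matching slice.
print(slice_with_range(a_list, 1, -2, 2) == a_list[1:-2:2])
print(slice_with_range(a_list, 8, 3, -2) == a_list[8:3:-2])
print(slice_with_range(a_list, None, None, -1) == a_list[::-1])
```

In everyday code you would just write the slice directly; the helper is only there to show which indexes a slice visits.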

Lists and for loop

Lists pair nicely with for loops: we can use a for loop to iterate through each item in a list. We do that by simply replacing the range() part of a for loop with the list’s name. Besides printing, you can use an accumulator to get the sum of all items. For example:

In [30]:

for item in a_list:
    print(item)

In [31]:

accumulator = 0

for item in a_list:
    accumulator = accumulator + item

print(accumulator)
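
As a small extension of the loops above, the sketch below checks the accumulator against Python's built-in sum() and uses the built-in enumerate() to get each item together with its index. It assumes the same ten-item a_list; the name total is introduced only for this example.

```python
a_list = [10, 4, 6, 6, 12, 61, 78, 34, 90, 73]

# The accumulator pattern from the cell above.
accumulator = 0
for item in a_list:
    accumulator = accumulator + item

# The built-in sum() computes the same total in one call.
total = sum(a_list)
print(accumulator, total)

# enumerate() yields (index, item) pairs, so we get both at once.
for index, item in enumerate(a_list):
    print(index, item)
```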

Tuples in Python

Tuples are similar to lists in that both store items in order. You can also access items in tuples using indexes and slices. The difference between the two is that lists are mutable and tuples are immutable. More specifically, you can change the items in a list after creating it, but a tuple cannot be changed once it has been created. The cells below demonstrate the mutability of lists and the immutability of tuples. By the way, creating a tuple is just like creating a list, except that you replace the brackets [] with parentheses ().

In [96]:

a_list = [10,20,30]
a_list[1] = 100
print(a_list)

[10, 100, 30]

In [97]:

a_tuple = (10,20,30)
print(a_tuple)

(10, 20, 30)

In [99]:

a_tuple[1] = 100

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-99-91bc41c29b37> in <module>
----> 1 a_tuple[1] = 100

TypeError: 'tuple' object does not support item assignment

Due to their immutability, tuples are safer to use because you cannot accidentally change their contents. They can also be a bit faster than lists. So, depending on your needs, you can choose between lists and tuples.
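
If you do need a changed version of a tuple, one possible approach is to copy it into a list, change the copy, and convert back. The sketch below shows this along with indexing and slicing on a tuple; the names first, tail, changed, as_list, and new_tuple are made up for this example.

```python
a_tuple = (10, 20, 30)

# Indexing and slicing work as with lists; a slice of a tuple is a tuple.
first = a_tuple[0]
tail = a_tuple[1:]

# Assigning to an item raises TypeError, which we can catch.
try:
    a_tuple[1] = 100
    changed = True
except TypeError:
    changed = False

# To get a modified version, copy into a list, edit, and convert back.
as_list = list(a_tuple)
as_list[1] = 100
new_tuple = tuple(as_list)

print(first, tail, changed)
print(a_tuple, new_tuple)
```

The original tuple is left untouched; new_tuple is a separate object.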

Dictionaries in Python

Dictionaries are another type of collection in Python; they store elements called key-value pairs. A key-value pair is similar to a word and its meaning that you can look up in a real dictionary. With Python dictionaries, you use a key to locate and obtain a specific value. Dictionaries look items up by key rather than by position, so we cannot use integer indexes and slices on them (although, since Python 3.7, they do remember the order in which items were inserted). The syntax to create a dictionary is as follows:

In [ ]:

dictionary_name = {
 key_1 : value_1,
 key_2 : value_2,
 ...
}

In a key:value pair, the key is usually a string (it can be something else, but that is a story for another day), and the value can be pretty much anything. Like lists, dictionaries can have as many items as you want. To obtain the value of a key, we use the syntax dictionary_name[key], which is similar to indexing a list, except that the “index” is a key. Below are some examples of creating a dictionary and accessing its items.

In [28]:

state_capitals={
    'New York': 'Albany',
    'New Jersey': 'Trenton',
    'Georgia' : 'Atlanta',
    'Texas' : 'Austin',
    'Washington' : 'Olympia'
}
state_capitals

Out[28]:

{'New York': 'Albany',
 'New Jersey': 'Trenton',
 'Georgia': 'Atlanta',
 'Texas': 'Austin',
 'Washington': 'Olympia'}

In [11]:

print(state_capitals)

{'New York': 'Albany', 'New Jersey': 'Trenton', 'Georgia': 'Atlanta', 'Texas': 'Austin', 'Washington': 'Olympia'}

In [4]:

state_capitals['New York']

Out[4]:

'Albany'

In [5]:

state_capitals['Georgia']

Out[5]:

'Atlanta'
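
A natural question is what happens when the key is not in the dictionary. The sketch below shows three common ways to deal with that, using the same state_capitals dictionary; 'Ohio' is just a hypothetical missing key, and missing_key is a name introduced for this example.

```python
state_capitals = {
    'New York': 'Albany',
    'New Jersey': 'Trenton',
    'Georgia': 'Atlanta',
    'Texas': 'Austin',
    'Washington': 'Olympia'
}

# The in operator checks whether a key exists.
print('Texas' in state_capitals)
print('Ohio' in state_capitals)

# get() returns None (or a default you supply) instead of raising an error.
print(state_capitals.get('Ohio'))
print(state_capitals.get('Ohio', 'unknown'))

# Square brackets raise KeyError when the key is missing.
try:
    state_capitals['Ohio']
except KeyError as error:
    missing_key = error.args[0]
    print("No such key:", missing_key)
```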

Finally, we can add a new key-value pair to a dictionary with the syntax dictionary_name[new_key] = new_value. Be careful though, because if new_key is already in the dictionary, the old value will be overwritten. For example:

In [28]:

state_capitals['Florida'] = 'Tallahassee'
state_capitals['Alabama'] = 'Montgomery'

In [30]:

state_capitals['Florida'] 

Out[30]:

'Tallahassee'

In [32]:

state_capitals['New York'] = "I don't know!"

In [33]:

state_capitals['New York']

Out[33]:

"I don't know!"
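
If overwriting is a concern, one possible safeguard is to check for the key before assigning. The sketch below uses a shortened state_capitals dictionary and a hypothetical helper, add_if_absent, written only for this post; it then loops over the result with the dictionary's items() method.

```python
state_capitals = {'New York': 'Albany', 'Texas': 'Austin'}

def add_if_absent(dictionary, key, value):
    """Add key:value only when key is not present yet (hypothetical helper)."""
    if key in dictionary:
        return False  # keep the existing value untouched
    dictionary[key] = value
    return True

added_florida = add_if_absent(state_capitals, 'Florida', 'Tallahassee')
added_new_york = add_if_absent(state_capitals, 'New York', "I don't know!")
print(added_florida, added_new_york)

# items() yields (key, value) pairs, which pair nicely with a for loop.
for state, capital in state_capitals.items():
    print(state, '->', capital)
```

Here 'New York' keeps its original value because the helper refuses to overwrite an existing key.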

Conclusion

In this post, I discussed three common types of collections in Python: lists, tuples, and dictionaries. You will see them pretty often in data analysis, so understanding them now will be helpful. Also, the concepts of indexing and slicing are highly important because we do the same on data sets as well. Please practice these two skills until you are comfortable with them. I will stop this long post now, so see you next time!
