
Python comprehension fun!

Published November 15, 2018 in programming

List comprehensions are a concise, idiomatic Python feature for transforming and filtering data stored in lists (or any other iterable), and they are typically faster than an equivalent for loop that appends to a list. The Python 3 docs describe how to use basic and nested list comprehensions.

Advanced nested list comprehensions

To clean some text by removing stop words and generating lists of word tokens, I can process each sentence in the text list and check its words against stop_words:
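The original listing is not reproduced here, so the following is a minimal sketch of the idea. The sample sentences, the stop_words contents and the tokens name are assumptions made for illustration.

```python
# Hypothetical sample data; the post's own text and stop_words are not shown.
text = ["the quick brown fox", "jumps over the lazy dog"]
stop_words = {"the", "over"}

# The outer comprehension walks the sentences; the inner one splits each
# sentence into words and keeps only those that are not stop words.
tokens = [
    [word for word in sentence.split() if word not in stop_words]
    for sentence in text
]

print(tokens)
# [['quick', 'brown', 'fox'], ['jumps', 'lazy', 'dog']]
```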

The inner list comprehension operates on the output of the split function called on each item in the text list.

It’s also easy to flatten the list of lists returned by the nested list comprehension using the Python standard library itertools.chain function:
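As a rough sketch, flattening could look like this. The nested tokens list is hypothetical sample data, and chain.from_iterable is used here; chain(*tokens) gives the same items.

```python
from itertools import chain

# Hypothetical nested list, shaped like the output of a nested comprehension.
tokens = [["quick", "brown", "fox"], ["jumps", "lazy", "dog"]]

# chain.from_iterable lazily yields the items of each inner list in order;
# list() collects them into a single flat list.
flat_tokens = list(chain.from_iterable(tokens))

print(flat_tokens)
# ['quick', 'brown', 'fox', 'jumps', 'lazy', 'dog']
```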

The flattened list is:

Try to avoid writing for loops for simple list tasks. Save for loops for complicated list operations where using comprehensions and built-in functions is not easy or straightforward.

Set comprehensions

Besides lists, we can use comprehensions to build sets from arbitrary data. Here is a simple contrived example:
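A small sketch of a set comprehension follows. The contents of test_list are assumed for illustration, since the original values are not shown.

```python
# Hypothetical input with repeated values.
test_list = [1, 2, 2, 3, 4, 4, 6, 7, 8, 8]

# Curly braces with a single expression build a set, so duplicates collapse.
test_set = {n for n in test_list if n % 2 == 0}

# Sets are unordered, so sort before printing for a stable display.
print(sorted(test_set))
# [2, 4, 6, 8]
```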

The test_set variable contains the even integers from test_list without repetition:

Dict comprehensions

We can also use comprehensions to build dictionaries from arbitrary data. For example, if I wanted to build a lookup table for a list of ordinal data categories:
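One way such a lookup table might be written is sketched below. The categories list and the lookup name are hypothetical, and ranks are assumed to start at zero.

```python
# Hypothetical ordinal categories, listed from lowest to highest.
categories = ["low", "medium", "high"]

# enumerate pairs each category with its position; the comprehension
# turns those pairs into a category -> rank mapping.
lookup = {category: rank for rank, category in enumerate(categories)}

print(lookup)
# {'low': 0, 'medium': 1, 'high': 2}
```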

The result is:

Two lists can also be joined into a dictionary, for example by using the zip function:
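A minimal sketch with made-up keys and values lists:

```python
# Hypothetical parallel lists of equal length.
keys = ["a", "b", "c"]
values = [1, 2, 3]

# zip pairs items by position; the comprehension builds the mapping.
joined = {key: value for key, value in zip(keys, values)}

print(joined)
# {'a': 1, 'b': 2, 'c': 3}
```

When no transformation of the keys or values is needed, dict(zip(keys, values)) produces the same dictionary.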

The result is:

One thing to watch out for when using zip is that the input lists are expected to have the same length. The function iterates until the shortest input is exhausted; remaining items from longer inputs are silently ignored. The itertools.zip_longest function works well when list sizes differ and using a fill value is appropriate.
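The difference can be seen in a short sketch. The lists and the fill value of 0 are assumptions chosen for illustration.

```python
from itertools import zip_longest

# Hypothetical lists of unequal length: three keys, two values.
keys = ["a", "b", "c"]
values = [1, 2]

# zip stops at the shorter input, so the key "c" is dropped.
truncated = dict(zip(keys, values))

# zip_longest keeps going and substitutes fillvalue for the missing item.
padded = dict(zip_longest(keys, values, fillvalue=0))

print(truncated)  # {'a': 1, 'b': 2}
print(padded)     # {'a': 1, 'b': 2, 'c': 0}
```

Note that the fill value is used on whichever side runs short, so if the keys list were the shorter one, the fill value would end up as a dictionary key.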

Dict and set comprehensions were introduced in Python 3.0 and backported to Python 2.7; PEP 274 describes dict comprehensions.

Generator expressions

Here’s where comprehensions get very interesting. Generator expressions were introduced in PEP 289 and share most of their syntax with list, set and dict comprehensions. Instead of building a concrete collection, however, a generator expression creates a lazy iterator that yields items on demand, so it can be passed directly to any function that accepts an iterable:
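A sketch of the idea, assuming a small hypothetical test_list and using max as the consuming function:

```python
# Hypothetical list of integers.
test_list = [3, 9, 4, 1]

# The generator expression is the sole argument to max, so no extra
# parentheses are needed; items are produced one at a time.
largest = max(n for n in test_list)

# The list-comprehension version gives the same answer but first builds
# an intermediate list in memory.
largest_via_list = max([n for n in test_list])

print(largest)
# 9
```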

This code returns the maximum integer in test_list. Explicitly creating and iterating over a list comprehension is not needed:

Here, we are computing log values from a list of integers. If log is undefined for a given input, NaN is used instead:
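The original code is not shown, so this is only one possible version. It assumes base-10 logarithms from the math module and a hypothetical values list, and it treats zero and negative inputs as undefined.

```python
import math

# Hypothetical inputs; 0 and -5 have no real logarithm.
values = [1, 10, 0, 100, -5]

# The conditional expression picks log10(v) for positive inputs and NaN
# otherwise; list() consumes the generator expression.
logs = list(math.log10(v) if v > 0 else math.nan for v in values)

print(logs)
# [0.0, 1.0, nan, 2.0, nan]
```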

The result is:

Combining a generator expression with next is an efficient way to filter a collection and return the first match, because iteration stops as soon as a match is found:
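For instance, with a hypothetical test_list and an assumed condition of "greater than one":

```python
# Hypothetical list of integers.
test_list = [1, 2, 3, 4]

# next pulls a single item from the generator; items after the first
# match (3 and 4 here) are never examined.
first_match = next(n for n in test_list if n > 1)

print(first_match)
# 2
```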

The first list item that matches is 2:

If the generator expression does not have a conditional statement, then next will simply return the first value generated by the expression.

Here, no list item is less than one, so the generator is exhausted without yielding a value and next raises the StopIteration exception:
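A sketch of that failure case, using an assumed test_list and a try/except block so the example runs to completion:

```python
# Hypothetical list in which every item is at least one.
test_list = [1, 2, 3, 4]

try:
    # Nothing satisfies n < 1, so the generator yields no items.
    first_match = next(n for n in test_list if n < 1)
    outcome = "match found"
except StopIteration:
    # next signals an exhausted iterator by raising StopIteration.
    outcome = "no match"

print(outcome)
# no match
```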

The result is:

The next function can also return a default value. In this case, the generator expression needs to be in parentheses:
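A minimal sketch, assuming a hypothetical test_list and None as the default:

```python
# Hypothetical list in which every item is at least one.
test_list = [1, 2, 3, 4]

# With a second argument, the generator expression is no longer the only
# argument to next, so it must be wrapped in its own parentheses.
first_match = next((n for n in test_list if n < 1), None)

print(first_match)
# None
```

Pick a default that cannot be confused with a real match; None works here because the list holds only integers.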

The result is:

PEP 289 is worth reading to dive even deeper into generator expressions.

Tags: python
