Split a List Into Evenly Sized Chunks in Python Using itertools

When working with lists in Python, you sometimes need to split a list into smaller, evenly sized chunks. This is particularly useful when dealing with large datasets or when an operation has to process a list one segment at a time. In this blog, we will explore how to split a list into evenly sized chunks in Python using itertools.

The itertools module provides a collection of tools for working with iterators and combining them into efficient, expressive code. Specifically, we will use the islice function from itertools together with Python's built-in iter function. Together they let us split a list into chunks of a specific size without complex logic or nested loops.

By the end of this blog, you will have a solid understanding of how to use the itertools module in Python to split a list into evenly sized chunks. Let’s look at a Python program that does this.

Python Program to Split a List Into Evenly Sized Chunks Using itertools

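# Split a list into evenly sized chunks in Python using itertools

from itertools import islice

def chunk(arr_range, arr_size):
    # Turn the input into an iterator so each islice call resumes where the last one stopped
    arr_range = iter(arr_range)
    # Call the lambda repeatedly until it returns the sentinel, an empty tuple
    return iter(lambda: tuple(islice(arr_range, arr_size)), ())

list(chunk(range(30), 5))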

Code Explanation

Importing the required Module

We begin by importing the islice function from the itertools module. This function will help us split the list into chunks of a specific size.

Define the Chunk Function

Next, we define a function called chunk that takes two arguments: arr_range (the list to be split) and arr_size (the desired size of each chunk).

Convert the List to an Iterator

Inside the chunk function, we convert arr_range into an iterator using the built-in iter function. An iterator remembers its position, so each later call to islice continues from where the previous one stopped instead of starting again at the first element.
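To see why the conversion matters, here is a small sketch comparing islice on a plain list with islice on an iterator. The variable names are illustrative and not part of the original program.

```python
from itertools import islice

numbers = [0, 1, 2, 3, 4, 5]

# Slicing the list itself restarts from the beginning on every call.
first_from_list = tuple(islice(numbers, 2))
second_from_list = tuple(islice(numbers, 2))

# An iterator remembers its position, so each call continues
# where the previous one stopped.
as_iterator = iter(numbers)
first_from_iterator = tuple(islice(as_iterator, 2))
second_from_iterator = tuple(islice(as_iterator, 2))

print(first_from_list, second_from_list)          # (0, 1) (0, 1)
print(first_from_iterator, second_from_iterator)  # (0, 1) (2, 3)
```

Without the iter call, the chunk function would keep returning the first chunk and never reach the end of the input.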

Create a Chunk Generator

We pass a lambda function to the two-argument form of the built-in iter function. Each time the lambda is called, it uses islice to take the next arr_size elements from the iterator (arr_range) and packs them into a tuple. The iter function returns an iterator that produces one chunk per call to the lambda.

Terminate the Chunk Generation

To terminate the chunk generation process, we pass an empty tuple () as the sentinel value, the second argument to iter. When the iterator is exhausted, islice returns nothing, the lambda produces (), and iteration stops.
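The sentinel mechanism can be sketched in isolation. This example replaces the lambda with a named function, next_chunk, purely for readability; the name is an assumption and does not appear in the original program.

```python
from itertools import islice

source = iter(range(5))

def next_chunk():
    """Return up to two items from the shared iterator as a tuple."""
    return tuple(islice(source, 2))

# iter(callable, sentinel) keeps calling next_chunk() and stops
# as soon as the return value equals the sentinel, here ().
chunks = list(iter(next_chunk, ()))
print(chunks)  # [(0, 1), (2, 3), (4,)]

# The source is now exhausted, so another call yields the sentinel itself.
leftover = next_chunk()
print(leftover)  # ()
```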

Retrieve the Chunks

Outside the chunk function, we call the list function on the iterator that chunk returns. This retrieves all the chunks and collects them into a list.

Print the Result

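Finally, the resulting list of chunks is displayed. The snippet relies on an interactive session (a REPL or notebook) echoing the value of the last expression; in a script, wrap the call in print(). The result splits range(30) into chunks of size 5.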

Output:

[(0, 1, 2, 3, 4),
 (5, 6, 7, 8, 9),
 (10, 11, 12, 13, 14),
 (15, 16, 17, 18, 19),
 (20, 21, 22, 23, 24),
 (25, 26, 27, 28, 29)]

The output shows the values of range(30) split into evenly sized chunks of size 5. Each tuple in the list is one chunk, and the chunks appear in the same order as the elements of the original sequence.

For example, the first chunk (0, 1, 2, 3, 4) holds the first five elements, the second chunk (5, 6, 7, 8, 9) holds the next five, and so on.

Here are a few alternate methods:

Using Yield:

This approach uses a generator function with the yield keyword to produce chunks one at a time. It is memory-efficient because it generates chunks on-the-fly without storing the entire result in memory. This method is useful when dealing with large lists or when memory usage is a concern.
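A minimal sketch of the yield approach, assuming the input supports len() and slicing (a list, tuple, or string). The function name chunk_with_yield is illustrative.

```python
def chunk_with_yield(items, size):
    """Yield successive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

chunks = chunk_with_yield(list(range(10)), 4)
first = next(chunks)    # chunks are produced only when requested
rest = list(chunks)

print(first)  # [0, 1, 2, 3]
print(rest)   # [[4, 5, 6, 7], [8, 9]]
```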

Using for Loop:

With a for loop and slicing, we can iterate over the list and extract chunks of the desired size. This method is simple and intuitive to understand. It works well for small to medium-sized lists.
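One way the loop-and-slice approach might look; chunk_with_loop is a hypothetical name chosen for this sketch.

```python
def chunk_with_loop(items, size):
    """Collect slices of `items` into a list, stepping `size` at a time."""
    chunks = []
    for start in range(0, len(items), size):
        chunks.append(items[start:start + size])
    return chunks

loop_chunks = chunk_with_loop(["a", "b", "c", "d", "e"], 2)
print(loop_chunks)  # [['a', 'b'], ['c', 'd'], ['e']]
```

Unlike the generator version, this builds the full list of chunks before returning.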

Using List Comprehension:

List comprehension offers a concise way to split a list into chunks using slicing notation. It creates a new list by iterating over the original list and extracting chunks of the desired size. This method is compact and readable, suitable for situations where code brevity is preferred.
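The same slicing logic fits in a single list comprehension. The variable names below are illustrative.

```python
items = list(range(7))
size = 3

# Start a new slice every `size` positions.
comprehension_chunks = [items[i:i + size] for i in range(0, len(items), size)]
print(comprehension_chunks)  # [[0, 1, 2], [3, 4, 5], [6]]
```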

Using NumPy:

NumPy, a library for numerical computing, provides the array_split function, which divides an array into a given number of sections rather than into chunks of a given size. When the array does not divide evenly, the leading sections receive one extra element. This method is convenient when the data is already numerical and will feed into subsequent NumPy calculations.

Using Collections:

The deque class from the collections module can also be used as the basis for chunking. A deque supports efficient appends and pops at both ends, so chunks can be built by repeatedly popping elements from the left. This method is mainly worthwhile when elements are also being appended or removed while the chunks are consumed.
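The text does not give a deque implementation, so the following is only one possible sketch. The function name chunk_with_deque and the size check are assumptions made for this example.

```python
from collections import deque

def chunk_with_deque(items, size):
    """Yield lists of up to `size` items, consuming a deque from the left."""
    if size < 1:
        raise ValueError("size must be at least 1")
    queue = deque(items)  # copies the input, so the original is left untouched
    while queue:
        # Take `size` items, or whatever is left for the final chunk.
        take = min(size, len(queue))
        yield [queue.popleft() for _ in range(take)]

deque_chunks = list(chunk_with_deque(range(7), 3))
print(deque_chunks)  # [[0, 1, 2], [3, 4, 5], [6]]
```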

We used the itertools method in this blog as it provides a compact and readable solution in just a few lines of code. The islice function is specifically designed for slicing iterators, making it well-suited for splitting lists into chunks.

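Because chunk works on an iterator, it accepts any iterable, including generators and file objects, and produces each chunk only when requested. Wrapping the result in list(), as in the example, does build every chunk in memory at once; iterate over the result instead when memory is a concern.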

Conclusion

In this blog, we have explored how to split a list into evenly sized chunks using the itertools module in Python. By leveraging the islice and iter functions, we can efficiently divide a list into smaller segments, making it easier to handle large datasets or perform operations on specific chunks.  

This approach provides a flexible and efficient way to process and manipulate data in manageable chunks, enabling more effective handling of large datasets and facilitating operations that require segment-wise processing. The itertools module, with its powerful functions, offers convenient solutions for various data manipulation tasks in Python.

Frequently Asked Questions

Can I adjust the chunk size according to my requirements?

Yes, you can adjust the chunk size by providing a different value for the arr_size argument when calling the chunk function. This allows you to split the list into chunks of any desired size.

What happens if the list length is not evenly divisible by the chunk size?

If the list length is not evenly divisible by the chunk size, the last chunk will contain the remaining elements. It will be smaller than the other chunks but still follow the desired chunking pattern.
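A short check of this behavior, repeating the chunk function from the article so the example runs on its own:

```python
from itertools import islice

def chunk(arr_range, arr_size):
    arr_range = iter(arr_range)
    return iter(lambda: tuple(islice(arr_range, arr_size)), ())

# 10 elements in chunks of 4: two full chunks and a shorter final one.
uneven = list(chunk(range(10), 4))
print(uneven)  # [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9)]
```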

How can I apply this technique to other data structures, such as strings or tuples?

The code works with any iterable, including strings and tuples. Simply replace range(30) with the desired data structure. Note that the chunks are always tuples, so a string is split into tuples of characters; join each tuple with "".join() if you need substrings.
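As an illustration, here is the article's chunk function applied to a string, followed by an optional step that turns each tuple of characters back into a substring. The sample string and variable names are made up for this sketch.

```python
from itertools import islice

def chunk(arr_range, arr_size):
    arr_range = iter(arr_range)
    return iter(lambda: tuple(islice(arr_range, arr_size)), ())

# Chunking a string yields tuples of single characters.
letter_chunks = list(chunk("abcdefg", 3))
print(letter_chunks)  # [('a', 'b', 'c'), ('d', 'e', 'f'), ('g',)]

# Join each tuple to get substrings instead.
string_chunks = ["".join(part) for part in letter_chunks]
print(string_chunks)  # ['abc', 'def', 'g']
```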

What does the islice function do in the code?

The islice function is used to slice an iterator and retrieve a specified number of elements. In this code, it takes the original iterator (arr_range) and the desired chunk size (arr_size) as arguments to create chunks of the specified size.
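For reference, a brief sketch of islice on its own, outside the chunk function. It shows the single-argument stop form used in the article and the start, stop, step form; count is another itertools function, used here only to supply an endless stream.

```python
from itertools import count, islice

# islice(iterable, stop): take the first `stop` items, even from an
# endless iterator such as count().
first_five = list(islice(count(), 5))
print(first_five)  # [0, 1, 2, 3, 4]

# islice(iterable, start, stop, step): works like slice notation,
# but on any iterable and without negative indices.
every_other = list(islice(range(10), 2, 8, 2))
print(every_other)  # [2, 4, 6]
```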

How does the lambda function contribute to creating chunks?

The lambda wraps the call tuple(islice(arr_range, arr_size)) in a zero-argument function. The two-argument form of iter calls that function repeatedly, yielding each tuple as a chunk, and stops as soon as the call returns the empty-tuple sentinel, which happens once the iterator is exhausted.
