Packing & Unpacking
1. What is a tuple in Python?
Tuples are a fundamental data structure in Python used to store collections of items. Unlike lists, tuples are immutable, meaning their contents cannot be modified after they are created. This immutability allows Python to optimize their internal storage, making tuples highly efficient in terms of memory consumption and speed.
Creating Tuples
The primary syntax to create a tuple in Python involves commas, not parentheses. Parentheses are optional but recommended for clarity.
Here are some examples of how you can create a tuple.
numbers = 1, 2, 3
print(numbers)
# Output: (1, 2, 3)
You can create a single-element tuple by adding a trailing comma (,) after the element. The comma is what turns the value into a tuple, so be careful with stray trailing commas in Python: they silently change the type of the value.
single = 1,
print(single)
# Output: (1,)
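To make the comma rule concrete, here is a small sketch comparing a few spellings. The variable names are illustrative only.

```python
# The comma, not the parentheses, is what makes a tuple.
not_a_tuple = (1)      # parentheses alone: just the int 1
one_item = (1,)        # trailing comma: a one-element tuple
no_parens = 1, 2, 3    # commas without parentheses
empty = ()             # the empty tuple is the one case that needs parentheses

print(type(not_a_tuple).__name__)  # int
print(type(one_item).__name__)     # tuple
print(no_parens == (1, 2, 3))      # True
print(len(empty))                  # 0
```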
Why Are Tuples Immutable and Efficient?
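Tuples are immutable by design, which brings both safety and performance benefits: a tuple can be hashed (as long as its elements are hashable), shared without defensive copies, and stored compactly. CPython keeps a tuple's element references in a single contiguous block of memory that is allocated at exactly the required size when the tuple is created, so there is no spare capacity to manage and element access is a simple index into that block. Furthermore, immutability lets CPython safely reuse some tuples, such as the empty tuple and tuples built from constants, which reduces memory usage.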
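A quick way to see immutability in practice is to try an in-place change. The names below are made up for the illustration.

```python
point = (1, 2, 3)

try:
    point[0] = 10          # tuples do not support item assignment
except TypeError as exc:
    error = str(exc)
    print("TypeError:", error)

# "Changing" a tuple really means building a new one.
moved = (10,) + point[1:]
print(moved)               # (10, 2, 3)
print(point)               # (1, 2, 3), the original is untouched
```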
Internal Storage of Tuples in Python
Python stores tuples using a fixed-size array in memory. Each tuple object includes:
- A reference count (to manage memory usage automatically).
- The type of object (indicating it’s a tuple).
- The tuple length (number of elements).
- An array of pointers referencing each element contained in the tuple.
This simple structure enhances both memory efficiency and speed for tuple operations.
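One rough way to observe this layout from Python is sys.getsizeof. This is a CPython-specific sketch; exact byte counts differ between versions and platforms, so no concrete sizes are shown here.

```python
import struct
import sys

pointer_size = struct.calcsize("P")  # size of one C pointer on this build

# Size in bytes of tuples holding 0..4 elements.
sizes = [sys.getsizeof(tuple(range(n))) for n in range(5)]
steps = [later - earlier for earlier, later in zip(sizes, sizes[1:])]

print(sizes)
print(steps)          # in CPython, each extra element costs one pointer
print(pointer_size)

# A list with the same contents carries extra bookkeeping for resizing,
# so it is usually larger than the equivalent tuple.
print(sys.getsizeof((1, 2, 3)), sys.getsizeof([1, 2, 3]))
```

If the steps all equal the pointer size, that matches the "array of pointers" picture: the tuple stores references, not the elements themselves.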
Practical Use-Cases for Tuples
- As dictionary keys (immutable requirement).
- For returning multiple values from functions.
- For defining fixed data collections that shouldn’t change throughout the program.
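The first two use cases can be sketched as follows. The grid dictionary and the min_max helper are hypothetical names chosen for this example.

```python
# Tuples as dictionary keys: (row, column) coordinates on a grid.
grid = {(0, 0): "start", (2, 3): "goal"}
print(grid[(2, 3)])            # goal

# A list cannot be a key because it is mutable and therefore unhashable.
try:
    grid[[1, 1]] = "wall"
except TypeError as exc:
    print("TypeError:", exc)

# Returning multiple values: the function packs them into one tuple.
def min_max(values):
    return min(values), max(values)

low, high = min_max([4, 1, 9])
print(low, high)               # 1 9
```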
2. Packing & Unpacking
Packing in Python is the process of collecting multiple values into a single container (usually a tuple).
On the other hand, unpacking is the opposite operation—extracting individual values from a container (such as a tuple, list, or other iterable) into separate variables.
a, b, c = [1, 2, 3]
print(a) # Output: 1
print(b) # Output: 2
print(c) # Output: 3
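A few more variations, as a sketch: packing without parentheses, unpacking from different iterables, a count mismatch, and nested targets. All names and values are illustrative.

```python
# Packing: the commas on the right build one tuple.
packed = 1, "two", 3.0
print(packed)                  # (1, 'two', 3.0)

# Unpacking: any iterable works, as long as the counts match.
x, y, z = packed
first, second = "ab"           # a string is iterable too
print(x, y, z)                 # 1 two 3.0
print(first, second)           # a b

# A count mismatch raises ValueError.
try:
    p, q = packed
except ValueError as exc:
    print("ValueError:", exc)

# Nested targets mirror the shape of the data.
name, (lat, lon) = "Seoul", (37.57, 126.98)
print(name, lat, lon)          # Seoul 37.57 126.98
```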
How does a, b = b, a work internally?
The statement a, b = b, a is a classic example of tuple packing and unpacking used for swapping two variables without using a temporary variable. In many other programming languages, a temporary variable is required to hold one of the values during a swap operation, to prevent data loss caused by overwriting one variable before its value is reassigned.
However, Python does not need the temporary variable.
Here’s how it works step by step.
- Evaluate the right-hand side.
  - b, a → This expression is evaluated first.
  - At this moment:
    b = 2
    a = 1
  - So the right-hand side becomes the tuple (2, 1) (via packing).
- Tuple packing (RHS).
  - Python packs the values into a temporary tuple.
    temp = (2, 1)
- Unpacking to the left-hand side.
  - Python unpacks the tuple into the variables.
    a = temp[0] # a = 2
    b = temp[1] # b = 1
- Final result
  a = 2
  b = 1
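This works because Python evaluates the entire right-hand side before rebinding any name on the left, conceptually creating a temporary tuple during the assignment. The intermediate values live on the interpreter's evaluation stack, so there's no need for an explicit temporary variable like in C-style languages. For a two-name swap, CPython usually optimizes the tuple away entirely and just reorders the values on the stack.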
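The "right-hand side first" rule is easy to check with a short sketch. The Fibonacci step is a common idiom that relies on the same behavior; the last part shows what goes wrong with naive sequential assignment.

```python
a, b = 1, 2
a, b = b, a                    # both old values are read before any rebinding
print(a, b)                    # 2 1

# The same rule makes this Fibonacci step work without a temp variable.
x, y = 0, 1
for _ in range(5):
    x, y = y, x + y            # uses the old x and the old y on the right
print(x, y)                    # 5 8

# Contrast: sequential assignment overwrites a value too early.
p, q = 1, 2
p = q
q = p
print(p, q)                    # 2 2
```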
Let’s check with bytecode. You can disassemble a function into Python bytecode with the dis module from the standard library.
import dis

def swap(a, b):
    a, b = b, a
    return a, b

dis.dis(swap)
Here’s what I get from my code.
  3           0 RESUME                   0

  4           2 LOAD_FAST                1 (b)
              4 LOAD_FAST                0 (a)
              6 STORE_FAST               1 (b)
              8 STORE_FAST               0 (a)

  5          10 LOAD_FAST                0 (a)
             12 LOAD_FAST                1 (b)
             14 BUILD_TUPLE              2
             16 RETURN_VALUE
- 2 LOAD_FAST 1 (b) pushes the value of b onto the stack.
- 4 LOAD_FAST 0 (a) pushes the value of a onto the stack. Now the stack contains [b, a].
- 6 STORE_FAST 1 (b) pops the top value from the stack (which is a) and stores it in the variable b.
- 8 STORE_FAST 0 (a) pops the next value from the stack (which is b) and stores it in the variable a.
- 10 LOAD_FAST 0 (a) loads the value of a onto the stack.
- 12 LOAD_FAST 1 (b) loads the value of b onto the stack.
- 14 BUILD_TUPLE 2 combines the top two values on the stack into a tuple.
- 16 RETURN_VALUE returns the tuple from the function.
Python’s a, b = b, a syntax might look like magic, but under the hood, it’s a simple, efficient sequence of bytecode operations. Here’s a breakdown of what makes it elegant.
- Python evaluates the right-hand side first, conceptually forming a tuple.
- Then, it unpacks the values to the left-hand variables.
- The entire operation is executed via the Python Virtual Machine (PVM) stack using just a few fast instructions.
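If you want to compare how different swaps compile on your own interpreter, a small helper around dis.get_instructions can list the opcode names. This is only a sketch: swap_two, rotate_four, and opcode_names are hypothetical names, and the output depends on the CPython version, so none is shown. On recent CPython releases a two-name swap is usually compiled without BUILD_TUPLE, while a swap of four or more names usually keeps a real BUILD_TUPLE followed by UNPACK_SEQUENCE; treat that as an expectation to verify, not a guarantee.

```python
import dis

def swap_two(a, b):
    a, b = b, a
    return a

def rotate_four(a, b, c, d):
    a, b, c, d = d, c, b, a
    return a

def opcode_names(func):
    """Return the opcode names of func in order, as reported by dis."""
    return [instruction.opname for instruction in dis.get_instructions(func)]

# Output varies by Python version; inspect it on your own interpreter.
print(opcode_names(swap_two))
print(opcode_names(rotate_four))
```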
3. * operator
If you’ve ever written a Python function that accepts a variable number of positional arguments, you’ve likely encountered the *args syntax. Or maybe you’ve seen it used in assignments like a, *b, c = [1, 2, 3, 4, 5]. At first glance, this syntax might seem like magic, but under the hood, Python has an exquisite and consistent way of handling it.
Let’s take a look at how *args works—both in assignment unpacking and in function definitions. We’ll peel back the abstraction and explore what happens inside the Python interpreter.
Unpacking with * in Assignment
You can use the * operator in an assignment to collect the remaining elements into a single variable. In my experience, this is not often used, but it may be necessary in some cases.
a, *b, c = [1, 2, 3, 4, 5]
Python handles this by scanning the left-hand side of the assignment and checking where the starred target appears. In this case, *b is in the middle. During the assignment, Python splits the iterable ([1, 2, 3, 4, 5]) so that the values on either side of *b are matched to the fixed variables (a and c), and the rest are collected into a list and bound to b. Here a is 1, c is 5, and b is [2, 3, 4].
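The position of the star can vary. The following sketch shows the star in the middle, at the end, and at the start, plus the case where nothing is left over; the variable names are illustrative.

```python
a, *b, c = [1, 2, 3, 4, 5]
print(a, b, c)                 # 1 [2, 3, 4] 5

head, *tail = "xyz"            # star at the end
print(head, tail)              # x ['y', 'z']

*init, last = (10, 20, 30)     # star at the start
print(init, last)              # [10, 20] 30

first, *middle, final = [1, 2] # nothing left over: the star gets an empty list
print(first, middle, final)    # 1 [] 2
```

Note that the starred name is always bound to a list, even when the source is a tuple or a string.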
Internally, this is handled at the interpreter level by calculating the number of values being unpacked and the number of targets to which they are assigned. Python allows exactly one starred target in an unpacking assignment. Once the position of the starred target is known, Python determines the following things.
- How many fixed elements are before and after the star?
- How many elements remain to fill the target marked with a star?
- Whether the right-hand side has enough elements to fill every fixed target, raising a ValueError if not. (Extra elements are never a problem, because the starred target absorbs them.)
This logic lives deep inside Python’s evaluation loop, particularly in the UNPACK_EX opcode, which is used for starred unpacking assignments. It’s a bit of magic in the bytecode, but very predictable once you understand the pattern.
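The checks described above can be observed directly. In this sketch, the errors list is just a hypothetical helper for recording which exceptions occurred; compile() is used so that the invalid statement does not break the rest of the example.

```python
errors = []                    # names of the exceptions we observe

# Two fixed targets need at least two values; one is not enough.
try:
    a, *b, c = [1]
except ValueError as exc:
    errors.append(type(exc).__name__)
    print("ValueError:", exc)

# More than one starred target is rejected at compile time.
try:
    compile("a, *b, *c = [1, 2, 3]", "<example>", "exec")
except SyntaxError as exc:
    errors.append(type(exc).__name__)
    print("SyntaxError:", exc.msg)

# Extra values are never an error when a star is present.
a, *b, c = range(10)
print(a, b, c)                 # 0 [1, 2, 3, 4, 5, 6, 7, 8] 9
```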
This form of usage is less common in return value handling, as functions typically return a fixed number of values, which are then matched to a corresponding number of variables.
However, in the context of function invocation, the * operator is often employed for argument forwarding. It facilitates the unpacking of an iterable into positional arguments, allowing developers to delegate calls to other functions without having to hardcode the argument list. This pattern is especially advantageous in scenarios where the arity of the called function may vary or evolve, thereby enhancing code flexibility and maintainability.
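As a minimal sketch of argument forwarding, here is a hypothetical logged decorator that passes along whatever it receives without knowing the wrapped function's signature. The area function is likewise invented for the example.

```python
import functools

def logged(func):
    """Hypothetical decorator: forwards whatever it receives to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__} with {args} {kwargs}")
        return func(*args, **kwargs)   # unpack back into a normal call
    return wrapper

@logged
def area(width, height, scale=1):
    return width * height * scale

print(area(2, 3))              # 6
print(area(2, 3, scale=10))    # 60

# The same operator spreads an existing sequence over positional parameters.
dimensions = (4, 5)
print(area(*dimensions))       # 20
```

Because wrapper never names the parameters, area can gain or lose arguments later without any change to logged.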
Using * to Unpack Iterables into a List
a = [1, 2, 3]
b = 'hi'
c = [*a, *b]
This expression creates a new list c by unpacking the contents of list a and the string b.
- *a unpacks [1, 2, 3] into individual elements: 1, 2, 3
- *b unpacks the string 'hi' into 'h', 'i'
- c becomes: [1, 2, 3, 'h', 'i']
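At the bytecode level, CPython 3.5 through 3.8 compile this to the BUILD_LIST_UNPACK opcode (https://docs.python.org/3.6/library/dis.html#opcode-BUILD_LIST_UNPACK), which takes multiple iterable objects from the stack and unpacks them into a single list. Since Python 3.9 that opcode no longer exists, and the same expression compiles to BUILD_LIST followed by a LIST_EXTEND for each unpacked iterable.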
Here’s the rough process.
- a and b are pushed onto the stack.
- Python confirms both are iterable.
- The contents of both iterables are expanded and concatenated.
- A new list is constructed and assigned to c.
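This behavior was introduced in Python 3.5 via PEP 448, which generalized unpacking syntax to support arbitrary iterables in list, tuple, and set displays, along with ** unpacking in dictionary displays and multiple unpackings in a single function call.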
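The same syntax works in the other displays as well. A short sketch, with illustrative variable names:

```python
a = [1, 2, 3]
b = "hi"

as_list = [*a, *b]
as_tuple = (*a, *b)
as_set = {*a, *a}              # duplicates collapse in a set
print(as_list)                 # [1, 2, 3, 'h', 'i']
print(as_tuple)                # (1, 2, 3, 'h', 'i')
print(sorted(as_set))          # [1, 2, 3]

# ** does the same job for mappings; later keys win on conflict.
defaults = {"color": "red", "size": 1}
overrides = {"size": 2}
merged = {**defaults, **overrides}
print(merged)                  # {'color': 'red', 'size': 2}

# Unlike a + b, unpacking accepts a mix of iterable types.
print([*a, *range(2), *b])     # [1, 2, 3, 0, 1, 'h', 'i']
```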
Unpacking a Dictionary in an Assignment Context
d = {'a': 1, 'b': 2, 'c': 3}
x, *y = d
This line performs iterable unpacking on the dictionary d, not on its keys or values explicitly. In Python, iterating over a dictionary by default yields its keys.
- x is assigned 'a'
- y becomes a list of the remaining keys: ['b', 'c']
This is handled by the same UNPACK_EX opcode that is used for starred assignments. Here’s what happens under the hood:
- The dictionary d is passed to the unpacking operation.
- Python implicitly calls iter(d), which returns an iterator over the dictionary’s keys.
- The unpacking pattern x, *y causes:
  - The first item to be assigned to x
  - The rest to be collected into a list and assigned to y
This mechanism works with any iterable, not just sequences like lists or tuples. The key requirement is that the object must implement the __iter__() method.
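To illustrate, the same pattern can be applied to dictionary views and to a generator. The countdown generator below is a hypothetical helper written only for this example.

```python
d = {"a": 1, "b": 2, "c": 3}

x, *y = d                      # keys, in insertion order
print(x, y)                    # a ['b', 'c']

v, *rest = d.values()          # ask for values explicitly
print(v, rest)                 # 1 [2, 3]

(k, val), *others = d.items()  # each item is a (key, value) pair
print(k, val, others)          # a 1 [('b', 2), ('c', 3)]

def countdown(n):
    """Hypothetical generator used only for this illustration."""
    while n > 0:
        yield n
        n -= 1

top, *lower = countdown(4)     # the generator is consumed by the unpacking
print(top, lower)              # 4 [3, 2, 1]
```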
*args in Function Definitions
Here’s a classic example.
def my_sum(*args):
    return sum(args)
When you call my_sum(1, 2, 3), Python binds 1, 2, and 3 into the single name args, which is a tuple: (1, 2, 3).
Here’s how this works internally!
When a function is defined with *args, Python marks that parameter in the function’s code object using a flag called CO_VARARGS. This tells the interpreter that any additional positional arguments passed to the function should be collected into a tuple and assigned to args.
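You can check this flag yourself: the inspect module exposes CO_VARARGS, and every function carries its flags in __code__.co_flags. In the sketch below, fixed, variadic, and has_varargs are hypothetical names.

```python
import inspect

def fixed(a, b):
    return a + b

def variadic(a, *args):
    return a + sum(args)

def has_varargs(func):
    """True if func's code object carries the CO_VARARGS flag."""
    return bool(func.__code__.co_flags & inspect.CO_VARARGS)

print(has_varargs(fixed))      # False
print(has_varargs(variadic))   # True

# co_argcount counts only the named positional parameters, not *args.
print(variadic.__code__.co_argcount)   # 1
print(variadic.__code__.co_varnames)   # ('a', 'args')
```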
When the function is called, Python does the following steps.
- It binds any explicitly named positional parameters first.
- Then, if more positional arguments are provided than expected, and a *args parameter exists, Python collects the “extra” arguments into a tuple.
- This tuple is assigned to the args variable within the function.
This mechanism is entirely runtime-driven and handled by the function call machinery inside CPython (for example, call_function and related internal routines in older releases; the exact function names vary between CPython versions).
Importantly, *args only collects positional arguments, not keyword arguments. That’s what **kwargs is for. However, the principle is similar: converting a variable-length collection into a unified container.
Interestingly, if you define a function without a *args parameter and pass too many arguments, Python raises a TypeError. But if you include *args, it absorbs any excess.
def greet(name, *args):
    print(f"Hello {name}")
    print("Extra args:", args)

greet("Alice", "extra1", "extra2")
# Output
# Hello Alice
# Extra args: ('extra1', 'extra2')
Again, note that args is always a tuple, even if you pass zero extra arguments. In that case, args is simply an empty tuple.
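The contrast between the two kinds of function can be sketched as follows. The strict and flexible functions are hypothetical and return their values instead of printing, so the results are easy to inspect.

```python
def strict(name):
    return f"Hello {name}"

def flexible(name, *args):
    return f"Hello {name}", args

# Without *args, surplus positional arguments are an error.
try:
    strict("Alice", "extra1")
except TypeError as exc:
    print("TypeError:", exc)

# With *args, the surplus lands in a tuple, which may be empty.
print(flexible("Alice", "extra1"))   # ('Hello Alice', ('extra1',))
print(flexible("Alice"))             # ('Hello Alice', ())

# Keyword arguments are not collected by *args.
try:
    flexible("Alice", mood="happy")
except TypeError as exc:
    print("TypeError:", exc)
```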
Summary
- Tuples are immutable, memory-efficient sequences used for storing fixed data. They’re created with commas, and Python optimizes their storage for speed and performance.
- Packing and unpacking let you assign or swap multiple values easily. Python internally handles this with tuple construction and stack-based bytecode operations, removing the need for temporary variables.
- The * operator enables flexible unpacking and argument forwarding, whether in assignments or functions. Internally, Python uses specialized opcodes and flags to manage variable-length arguments and starred expressions efficiently.