OOP in Python, part 7: The __new__() method
MP 46: A very technical method that can be helpful to understand.
Note: This is the seventh post in a series about OOP in Python. The previous post discussed the __str__() and __repr__() methods. The next post shares a number of ways you can write custom rules for comparing objects, and use symbols to define operations between objects.
So far we’ve covered most of the kinds of methods you can include in a class, and some of the most useful special methods that you can override as needed. There are a few more special methods that are worth knowing about. In this post, we’ll cover just one of these: the __new__() method, which is responsible for creating new instances from a class.
This is one of the more technical posts I’ve written on Mostly Python. If this post doesn’t make much sense, consider skimming it and coming back for a reread after the discussion about inheritance in the posts that follow. I think this is a worthwhile deep dive, however. If you can make sense of what’s presented here, you’ll come away with a pretty good understanding of what happens when you create an instance from a class. You’ll end up with a deeper understanding of the fundamental differences between mutable and immutable objects as well.[1]
The __new__() method
When you create an instance from a class, Python automatically calls __new__(). This method grabs a chunk of memory where the object will be stored, and returns the newly-created instance object. After running __new__(), Python passes that instance to the __init__() method, using an inherited version if the class doesn’t define its own.
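Here’s a minimal sketch of that sequence. The Demo class and the calls list are hypothetical names made up for this illustration; they aren’t part of the examples that follow. Each method records its name so we can see the order in which Python runs them:

```python
calls = []  # records the order in which the methods run

class Demo:
    """Hypothetical class, used only to trace instance creation."""

    def __new__(cls, *args, **kwargs):
        calls.append("__new__")
        # object.__new__() creates the instance; it only needs the class.
        return super().__new__(cls)

    def __init__(self, value):
        calls.append("__init__")
        self.value = value

demo = Demo(42)
print(calls)       # ['__new__', '__init__']
print(demo.value)  # 42
```

The instance exists as soon as __new__() returns; __init__() only fills in its attributes afterward.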
Most of what people are tempted to do by overriding __new__() can usually be done in __init__(). The main reason to override __new__() is if you’re subclassing one of Python’s immutable classes, such as a tuple.[2]
Excluding zero values from a list
Imagine you want to create a special kind of list, which will reject all zero values it’s given. If you pass it a sequence of numbers that includes some zeroes, you’ll get back a list with all of those numbers, except for the zeroes:
NonZeroList((1, 2, 0, 3, 5, 0)) -> [1, 2, 3, 5]
Here’s how you might do that:
class NonZeroList(list):

    def __init__(self, values):
        """Only keep nonzero values."""
        values = [v for v in values if v]
        super().__init__(values)

my_list = NonZeroList((1, 2, 0, 3, 5, 0))
print(my_list)
We make a class called NonZeroList that inherits from Python’s built-in list class. In the __init__() method, we use a list comprehension to keep only the values that evaluate to True in a Boolean context, which gets rid of any zero values (along with any other falsy values, such as None or an empty string). We then use super() to pass these remaining values to the __init__() method of the built-in list class.[3]
This approach works:
[1, 2, 3, 5]
Let’s run type() and help() on my_list, to see exactly what we’ve created:
...
my_list = NonZeroList((1, 2, 0, 3, 5, 0))
print(my_list)
print(type(my_list))
help(my_list)
Here’s a portion of the output:
[1, 2, 3, 5]
<class '__main__.NonZeroList'>
Help on NonZeroList in module __main__ object:
class NonZeroList(builtins.list)
| NonZeroList(values)
...
| Methods inherited from builtins.list:
...
| append(self, object, /)
| Append object to the end of the list.
...
| pop(self, index=-1, /)
| Remove and return item at index (default last).
...
This output shows that my_list is an instance of the NonZeroList class. It also shows that my_list has inherited all the methods of a standard list object, such as append() and pop().
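One consequence of inheriting those methods is worth a quick look. This is a side note rather than part of the original example: the filter only runs in __init__(), so an inherited method like append() knows nothing about it and can still add a zero later.

```python
class NonZeroList(list):

    def __init__(self, values):
        """Only keep nonzero values."""
        values = [v for v in values if v]
        super().__init__(values)

my_list = NonZeroList((1, 2, 0, 3, 5, 0))
print(my_list)      # [1, 2, 3, 5]

# The inherited append() is the plain list method, with no filtering.
my_list.append(0)
print(my_list)      # [1, 2, 3, 5, 0]
```

If you wanted the rule to hold for the life of the object, you’d also need to override the methods that add items.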
Excluding zero values from a tuple
Let’s try excluding zero values from a tuple, using the same approach:
class NonZeroTuple(tuple):

    def __init__(self, values):
        """Only keep nonzero values."""
        values = [v for v in values if v]
        super().__init__(values)

my_tuple = NonZeroTuple((1, 2, 0, 3, 5, 0))
print(my_tuple)
The only difference here is that NonZeroTuple inherits from the built-in tuple class instead of the built-in list class.
This attempt fails:
Traceback (most recent call last):
  File "nonzero_tuple.py", line 6, in __init__
    super().__init__(values)
TypeError: object.__init__() takes exactly one argument
    (the instance to initialize)
This is an interesting failure, and it gets at the heart of what tuples really are, and how Python creates instances from classes. Unless you’ve dug into these weeds on your own previously, it’s probably not at all obvious how to interpret this error.
It can be tempting to think of __init__() as something that creates new objects. But it really doesn’t do that; that’s the job of __new__(). Instead, __init__() modifies an instance that was created by __new__(). Since a tuple is immutable, you can’t really do anything with it in the __init__() method. By the time __init__() runs, the tuple object has already been created, and once it’s been created it can’t be modified.
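To see this for yourself, here’s a small sketch using a hypothetical LoudTuple class (the name and the seen list are made up for this illustration). Its __init__() method just records what the instance already contains:

```python
seen = []  # snapshots of the instance, taken inside __init__()

class LoudTuple(tuple):
    """Hypothetical subclass that records what __init__() finds."""

    def __init__(self, values):
        # The tuple is already fully built here, zeroes included.
        seen.append(tuple(self))

t = LoudTuple((1, 0, 2))
print(seen)  # [(1, 0, 2)]

try:
    t[1] = 5  # item assignment is not supported on tuples
except TypeError as error:
    print(f"TypeError: {error}")
```

The values were placed in the tuple before __init__() ever ran, and there’s no way to remove them afterward.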
The error is happening in the line that tries to call super().__init__(). The tuple class doesn’t define its own __init__() method, so the call gets passed up to Python’s base object class. That __init__() method only takes one argument, “the instance to initialize”. When we pass it our sequence of values, it raises an exception.
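You can check where these methods live without writing any classes. This is a quick sketch, and the results shown are what I’d expect from CPython; other implementations could organize their built-in types differently.

```python
# vars(cls) shows only what a class defines itself, not what it inherits.
print("__init__" in vars(tuple))          # False
print(tuple.__init__ is object.__init__)  # True
print("__new__" in vars(tuple))           # True

# list, being mutable, does define its own __init__().
print("__init__" in vars(list))           # True
```

So tuple does all of its work in __new__(), while list has an __init__() of its own that fills in an already-created list.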
Writing a custom __new__() method
This all works, however, if we do the filtering for nonzero values in __new__(). The end goal of __new__() is to return a newly-created instance of the class. So we’ll override __new__() instead of writing an __init__() method. In our version of this method we’ll filter out the zero values, and then call the __new__() method of the built-in tuple class. We should end up with a tuple-like object containing only nonzero values.
Here’s the code:
class NonZeroTuple(tuple):

    def __new__(cls, values):
        """Only keep nonzero values."""
        values = [v for v in values if v]
        return super().__new__(cls, values)

my_tuple = NonZeroTuple((1, 2, 0, 3, 5, 0))
print(my_tuple)
Here we’re working in __new__() instead of __init__(). This method is automatically passed a reference to the current class, which corresponds to the parameter cls. The sequence we passed when making the instance arrives in the parameter values.
The new instance isn’t actually created until we call super().__new__(). Before we make that call, we’re free to do anything we want with the values we’ve received. So we first filter out the zero values, and then call the built-in __new__() method. This method needs a reference to the class, and the values that the tuple will contain.
This works!
(1, 2, 3, 5)
Again, you can call type(my_tuple) to see that this is an instance of NonZeroTuple, and help(my_tuple) to see that it has all the methods that a tuple has.
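If you’d rather not read through the help() output, a few quick checks show the same thing. This sketch repeats the class so it runs on its own, and uses count() and index() as two representative tuple methods:

```python
class NonZeroTuple(tuple):

    def __new__(cls, values):
        """Only keep nonzero values."""
        values = [v for v in values if v]
        return super().__new__(cls, values)

my_tuple = NonZeroTuple((1, 2, 0, 3, 5, 0))

print(type(my_tuple).__name__)      # NonZeroTuple
print(isinstance(my_tuple, tuple))  # True

# Methods inherited from the built-in tuple class:
print(my_tuple.count(0))            # 0
print(my_tuple.index(3))            # 2
```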
Conclusions
Most people are unlikely to ever need to override __new__() when writing or working with classes. But knowing of its existence, and understanding the difference between what __new__() does and what __init__() does, will give you a much clearer understanding of how Python carries out the work of creating instances from a class.
If you take nothing else away from this discussion, remember that __new__() creates new objects in memory, associated with a particular class. The __init__() method can then customize that new object by setting attributes and doing any other necessary initialization work.
If you end up needing to override __new__(), check out the relevant section in Python’s Data model documentation.
Resources
You can find the code files from this post in the mostly_python GitHub repository.
[1] Here’s a quick recap of inheritance, if it helps.
One class can inherit from another, by including that class name in parentheses in the class definition:
class Dog(Animal):
    pass
In this tiny example, the Dog class inherits from the Animal class. This means the Dog class starts out with all the attributes and methods already defined in the Animal class. From that point forward, you’re free to add new attributes and methods that are specific to dogs, and override any attributes or methods in Animal that are not relevant to dogs.
In this example, Animal is the parent class and Dog is the child class. Animal is also called the superclass of Dog, and Dog is a subclass of Animal. The function super(), when called inside a method of the Dog class, gives you access to the parent class or superclass of Dog; in this case super() refers to Animal.
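Here’s a slightly fuller sketch of the same idea, with super() actually in use. The name and breed attributes and the describe() method are made up for this illustration:

```python
class Animal:

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is an animal."

class Dog(Animal):

    def __init__(self, name, breed):
        # Let Animal.__init__() handle the setup that all animals share.
        super().__init__(name)
        self.breed = breed

dog = Dog("Willie", "beagle")
print(dog.describe())           # Willie is an animal.
print(isinstance(dog, Animal))  # True
```

Dog never defines describe(), but it inherits the method from Animal.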
In this post we’ll write classes that inherit from the built-in list and tuple classes.
[2] If you haven’t come across these terms before, mutable objects can be modified after they have been created. Immutable objects cannot be changed once they’ve been created.
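A short sketch of the difference, using a list as the mutable object and a tuple as the immutable one (the variable names are just for illustration):

```python
numbers_list = [1, 2, 3]
numbers_list[0] = 10       # a list can be changed in place
print(numbers_list)        # [10, 2, 3]

numbers_tuple = (1, 2, 3)
try:
    numbers_tuple[0] = 10  # a tuple can't
except TypeError as error:
    print(f"TypeError: {error}")

print(numbers_tuple)       # (1, 2, 3)
```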
[3] Most people make a simple list using square bracket notation:
my_list = [1, 2, 0, 3, 5, 0]
You can also make the same sequence by making a call to list():
my_list = list((1, 2, 0, 3, 5, 0))
The word list in this snippet looks like a function, but it actually refers to the built-in class called list. Many of Python’s built-in classes are written in lowercase, unlike the convention we use for most of the classes we write.
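A few quick checks confirm that list is a class, and that both ways of making a list give the same result:

```python
print(type(list))                # <class 'type'>
print(isinstance([1, 2], list))  # True

from_brackets = [1, 2, 0, 3, 5, 0]
from_call = list((1, 2, 0, 3, 5, 0))
print(from_brackets == from_call)  # True
```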