I work alongside many machine learning engineers and data scientists. Python is their preferred language. However, not all of them are experienced Python developers, and they are unlikely to have mastered all the excellent features Python provides. This is understandable, of course, but it is also unfortunate. Why? Because writing good code requires understanding the details of the language.
That's why I want to help those who want to improve their Python skills, so that you can write better code, impress your partners or colleagues, and have more fun! Specifically, in this article, I want to talk about how to use Python's magic methods to write amazing classes. Let's start.
What are magic methods
Magic methods are, first of all, methods, that is, functions belonging to classes. They can be either instance methods or class methods. You can easily identify them because they all start and end with double underscores, that is, they all look like __actual_name__.
Importantly, magic methods are not meant to be called directly! Of course, you can do so and write something like YourClass().__actual_name__(), but please don't.
So how are magic methods called? Python calls them for you when appropriate. For example, str(YourClass()) calls the magic method __str__, and YourClass() + YourClass() calls __add__, provided you have implemented these two magic methods.
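To make this implicit dispatch concrete, here is a minimal sketch. The Money class and its amount attribute are hypothetical and exist only for illustration; the point is that neither __add__ nor __str__ is ever called by name.

```python
class Money:
    """Hypothetical example class, not part of the article's code."""

    def __init__(self, amount: int) -> None:
        self.amount = amount

    def __str__(self) -> str:
        # Invoked by str(obj), print(obj) and f-strings.
        return f"{self.amount} EUR"

    def __add__(self, other: "Money") -> "Money":
        # Invoked by the + operator when the left operand is a Money.
        if not isinstance(other, Money):
            return NotImplemented
        return Money(self.amount + other.amount)


total = Money(5) + Money(7)  # Python calls Money.__add__ behind the scenes
text = str(total)            # Python calls Money.__str__ behind the scenes
print(text)
```

Returning NotImplemented for foreign types lets Python try the other operand's reflected method before raising a TypeError.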
So what's the use of magic methods? They let us write classes that work with Python's built-in functions and operators, which makes the resulting code easier to read and less redundant.
To emphasize the usefulness of magic methods and to see how they can benefit machine learning or data science work, let's look at a concrete example.
Example: a custom datetime range class
The following code shows how to use magic methods to write a DateTimeRange class that behaves like the built-in range. The code is as follows:
    from datetime import datetime, timedelta
    from math import ceil
    from typing import Iterator


    class DateTimeRange:
        def __init__(self, start: datetime, end_: datetime,
                     step: timedelta = timedelta(seconds=1)):
            self._start = start
            self._end = end_
            self._step = step

        def __iter__(self) -> Iterator[datetime]:
            # Generator: yields one datetime at a time instead of building a list.
            point = self._start
            while point < self._end:
                yield point
                point += self._step

        def __len__(self) -> int:
            # Number of steps that fit into [start, end); never negative.
            return max(0, ceil((self._end - self._start) / self._step))

        def __contains__(self, item: datetime) -> bool:
            # divmod returns the tuple (x // y, x % y). Invariant: div * y + mod == x.
            mod = divmod(item - self._start, self._step)
            return self._start <= item < self._end and mod[1] == timedelta(0)

        def __getitem__(self, item: int) -> datetime:
            # Negative indices count from the end, as with lists.
            n_steps = item if item >= 0 else len(self) + item
            return_value = self._start + n_steps * self._step
            if return_value not in self:
                raise IndexError()
            return return_value

        def __str__(self):
            return f"Datetime Range [{self._start}, {self._end}) with step {self._step}"


    def main():
        my_range = DateTimeRange(datetime(2021, 1, 1), datetime(2021, 12, 1), timedelta(days=12))
        print(my_range)
        print(f"{len(my_range) == len(list(my_range)) = }")
        print(f"{my_range[-2] in my_range = }")
        print(f"{my_range[2] + timedelta(seconds=12) in my_range = }")
        for r in my_range:
            print(r)  # do_something(r)


    if __name__ == '__main__':
        main()
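Let's look at the output first. The script prints the string form of the range, confirms that len(my_range) equals the number of elements the range generates, shows that my_range[-2] is in the range while my_range[2] shifted by 12 seconds is not, and then prints every datetime from 2021-01-01 onward in 12-day steps.
Seeing the output may help you understand more quickly what the DateTimeRange class does. There is a lot of code, but don't worry, I'll explain.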
In total, the code above implements six different magic methods:
1. The __init__ method. You probably know that this method initializes the instance attributes of your class. Here, we pass the start and end of the range, together with the step size, to DateTimeRange.
2. The __iter__ method. It is called when you use the object in a for loop or call list(DateTimeRange(...)). This is probably the most important method because it generates all the elements in our datetime range. It is a so-called generator function: it creates one element at a time, hands it to the caller, and lets the caller process it. It does this until it reaches the end of the range. You can easily identify a generator function by the yield keyword. This statement pauses the function, saves all its state, and resumes from there on the next call. This lets you consume one element at a time without having to hold every element in memory.
When the range is large, putting everything in memory becomes very memory intensive. For example, executing list(DateTimeRange(datetime(1900,1,1), datetime(2000,1,1))) would put 3155673600 datetimes into memory, one for every second of the century. That is far too many, but with the generator you can easily process these elements one by one.
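The lazy behavior described above can be sketched as follows. The function datetime_points is a hypothetical stand-alone copy of the __iter__ logic, and itertools.islice is used here only to take the first few values; nothing beyond the requested elements is ever computed.

```python
from datetime import datetime, timedelta
from itertools import islice
from typing import Iterator


def datetime_points(start: datetime, end: datetime, step: timedelta) -> Iterator[datetime]:
    """Hypothetical stand-alone version of the __iter__ logic."""
    point = start
    while point < end:
        yield point  # pause here until the caller asks for the next value
        point += step


# A century at one-second resolution: far too many values for a list,
# but the generator only ever holds the current point in memory.
century = datetime_points(datetime(1900, 1, 1), datetime(2000, 1, 1), timedelta(seconds=1))
first_three = list(islice(century, 3))
print(first_three)
```

Because the generator keeps its state between calls, the next value requested from century continues right after the three already consumed.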
3. As you can see, a DateTimeRange is not a list or a tuple. However, so that it can be handled as if it were one, I added three more magic methods, namely __len__, __contains__ and __getitem__.
With __len__, you can find out how many elements belong to your range by calling len(my_range). This is very useful, for example, when you iterate over all elements and want to know how many of the available elements have been processed so far. It may also tell you: hey, there is a lot of data to process, go and have a cup of coffee.
With __contains__, you can use the built-in syntax element in my_range to check whether an element belongs to your range. The advantage of this implementation is that the check uses pure arithmetic and never compares the given element with all the elements in the range. This means that checking whether an element is within your range is a constant-time operation that does not depend on the size of the range. Again, this is very convenient for the large ranges we often see when processing data.
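The arithmetic behind the membership check can be sketched on its own. The helper in_range and the module-level start, end and step values are hypothetical names chosen for this illustration; they mirror the logic of __contains__ without looping over the range.

```python
from datetime import datetime, timedelta

start = datetime(2021, 1, 1)
end = datetime(2021, 12, 1)
step = timedelta(days=12)


def in_range(item: datetime) -> bool:
    """Hypothetical helper mirroring the __contains__ logic: no iteration needed."""
    # divmod on two timedeltas returns (whole steps as int, remainder as timedelta).
    _, remainder = divmod(item - start, step)
    return start <= item < end and remainder == timedelta(0)


on_grid = in_range(datetime(2021, 1, 25))             # exactly 2 steps after start
off_grid = in_range(datetime(2021, 1, 25, 0, 0, 12))  # 12 seconds past a grid point
too_late = in_range(start + 28 * step)                # on the grid, but not before end
print(on_grid, off_grid, too_late)
```

The cost is one subtraction, one divmod and two comparisons, regardless of how many elements the range contains.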
With __getitem__, you can use index syntax to retrieve entries from the object. For example, you can get the last element of our range with my_range[-1]. In general, __getitem__ lets you write a very clean and readable interface.
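Here is a small sketch of the index arithmetic, including the negative-index case. The helper nth_point and the variable length are hypothetical; an explicit bounds check stands in for the class's own membership test, which is an assumption made to keep the example self-contained.

```python
from datetime import datetime, timedelta
from math import ceil

start, end, step = datetime(2021, 1, 1), datetime(2021, 12, 1), timedelta(days=12)
length = ceil((end - start) / step)  # same formula as __len__ in the class


def nth_point(index: int) -> datetime:
    """Hypothetical helper mirroring the __getitem__ logic."""
    n_steps = index if index >= 0 else length + index  # map -1 to the last position
    if not 0 <= n_steps < length:
        raise IndexError(f"index {index} out of range")
    return start + n_steps * step


last = nth_point(-1)
print(length, nth_point(0), last)
```

As with lists, an index outside the valid positions raises IndexError instead of silently returning a datetime beyond the end of the range.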
4. The __str__ method converts an instance of a class to a string. It is called automatically whenever the instance is converted to a string, for example by print(my_range) or str(my_range).
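To see what __str__ buys you, compare a class that defines it with one that does not. Both classes below, Plain and Labeled, are hypothetical and serve only as an illustration.

```python
class Plain:
    """Hypothetical class without __str__."""


class Labeled:
    """Hypothetical class that defines __str__."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __str__(self) -> str:
        return f"Labeled({self.name})"


plain_text = str(Plain())         # falls back to the default "<... object at 0x...>" form
printed = f"{Labeled('demo')}"    # f-strings use __str__ by default
converted = str(Labeled("demo"))  # so does str()
print(plain_text, printed, converted)
```

Without __str__, the string form only shows the class name and a memory address, which is rarely helpful in logs or notebooks.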
Final words
This article showed how to write an elegant class using magic methods. Magic methods are called automatically by Python's built-in functions and operators, which lets us write classes that are more readable and easier to use, just like the DateTimeRange class in this article.