When Should I Be Using Classes in Python

When should I be using classes in Python?

Classes are the pillar of Object Oriented Programming. OOP is highly concerned with code organization, reusability, and encapsulation.

First, a disclaimer: OOP is partially in contrast to Functional Programming, which is a different paradigm used a lot in Python. Not everyone who programs in Python (or surely most languages) uses OOP. You can do a lot in Java 8 that isn't very Object Oriented. If you don't want to use OOP, then don't. If you're just writing one-off scripts to process data that you'll never use again, then keep writing the way you are.

However, there are a lot of reasons to use OOP.

Some reasons:

  • Organization:
    OOP defines well-known, standard ways of describing and defining both data and procedure in code. Both data and procedure can be stored at varying levels of definition (in different classes), and there are standard ways of talking about these definitions. That is, if you use OOP in a standard way, it will help your later self and others understand, edit, and use your code. Also, instead of using a complex, arbitrary data storage mechanism (dicts of dicts, lists of dicts, lists of dicts of sets, or whatever), you can name the pieces of your data structures and conveniently refer to them.

  • State: OOP helps you define and keep track of state. For instance, in a classic example, if you're creating a program that processes students (for instance, a grade program), you can keep all the info you need about them in one spot (name, age, gender, grade level, courses, grades, teachers, peers, diet, special needs, etc.), and this data is persisted as long as the object is alive, and is easily accessible. In contrast, in pure functional programming, state is never mutated in place.

  • Encapsulation:
    With encapsulation, procedure and data are stored together. Methods (an OOP term for functions) are defined right alongside the data that they operate on and produce. In a language like Java that allows for access control, or in Python, depending upon how you describe your public API, this means that methods and data can be hidden from the user. What this means is that if you need or want to change code, you can do whatever you want to the implementation of the code, but keep the public APIs the same.

  • Inheritance:
    Inheritance allows you to define data and procedure in one place (in one class), and then override or extend that functionality later. For instance, in Python, I often see people creating subclasses of the dict class in order to add functionality. A common change is overriding the method that raises an exception when a missing key is requested, so that the dictionary returns a default value based on the unknown key instead. Inheritance lets you extend your own code now or later, lets others extend your code, and lets you extend other people's code.

  • Reusability: All of these reasons and others allow for greater reusability of code. Object oriented code allows you to write solid (tested) code once, and then reuse it over and over. If you need to tweak something for your specific use case, you can inherit from an existing class and override the existing behavior. If you need to change something, you can change it while keeping the existing public method signatures, and no one is the wiser (hopefully).
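
The dict subclass mentioned under Inheritance could be sketched roughly as follows. This is only an illustration: the class name DefaultingDict and its default_factory parameter are made up for this example. It relies on the __missing__ hook, which dict calls on subclasses when a key lookup fails. For the common case, collections.defaultdict from the standard library is usually enough.

```python
class DefaultingDict(dict):
    """A dict that computes a default from the missing key (hypothetical example)."""

    def __init__(self, default_factory, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # default_factory is an assumed callable: key -> default value
        self._default_factory = default_factory

    def __missing__(self, key):
        # dict.__getitem__ calls this only when the key is not present.
        value = self._default_factory(key)
        self[key] = value  # remember the computed default
        return value


lengths = DefaultingDict(len, {"math": 4})
print(lengths["math"])     # existing key, stored value is returned
print(lengths["history"])  # missing key, default is computed from the key
```

All other dict behavior (iteration, len, in, and so on) is inherited unchanged, which is the point of subclassing here.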

Again, there are several reasons not to use OOP, and you don't need to. But luckily, with a language like Python, you can use just a little bit or a lot. It's up to you.

An example of the student use case (no guarantee on code quality, just an example):

Object Oriented

class Student(object):
    def __init__(self, name, age, gender, level, grades=None):
        self.name = name
        self.age = age
        self.gender = gender
        self.level = level
        self.grades = grades or {}

    def setGrade(self, course, grade):
        self.grades[course] = grade

    def getGrade(self, course):
        return self.grades[course]

    def getGPA(self):
        return sum(self.grades.values())/len(self.grades)

# Define some students
john = Student("John", 12, "male", 6, {"math":3.3})
jane = Student("Jane", 12, "female", 6, {"math":3.5})

# Now we can get to the grades easily
print(john.getGPA())
print(jane.getGPA())

Standard Dict

def calculateGPA(gradeDict):
    return sum(gradeDict.values())/len(gradeDict)

students = {}
# We can set the keys to variables so we might minimize typos
name, age, gender, level, grades = "name", "age", "gender", "level", "grades"
john, jane = "john", "jane"
math = "math"
students[john] = {}
students[john][age] = 12
students[john][gender] = "male"
students[john][level] = 6
students[john][grades] = {math:3.3}

students[jane] = {}
students[jane][age] = 12
students[jane][gender] = "female"
students[jane][level] = 6
students[jane][grades] = {math:3.5}

# At this point, we need to remember who the students are and where the grades are stored. Not a huge deal, but avoided by OOP.
print(calculateGPA(students[john][grades]))
print(calculateGPA(students[jane][grades]))

When should I use classes and self method in Python?

Classes and OOP are, IMHO, always a good choice. By using them, you will be able to better organize and reuse your code. You can create new classes that derive from an existing class to extend its functionality (inheritance) or to change its behavior if you need to (polymorphism), and you can encapsulate the internals of your code so it becomes safer (although Python does not enforce encapsulation).

In your specific case, for example, you are building a calculator that uses a technique to calculate an intersection. If somebody else using your class wants to modify that behavior, they can override the method (this is polymorphism in action):

class PointCalculator:
    def intersection(self, P1, P2, dist1, dist2):
        # Your initial implementation
        ...

class FasterPointCalculator(PointCalculator):
    def __init__(self):
        super().__init__()

    def intersection(self, P1, P2, dist1, dist2):
        # New implementation
        ...

Or, you might extend the class in the future:

class BetterPointCalculator(PointCalculator):
    def __init__(self):
        super().__init__()

    def distance(self, P1, P2):
        # New function
        ...

You may need to initialize your class with some required data, and you may not want users to be able to modify it. You can signal that by prefixing the attribute names with an underscore:

class PointCalculator:
    def __init__(self, p1, p2):
        self._p1 = p1
        self._p2 = p2

    def do_something(self):
        # Do something with your data
        return self._p1 + self._p2

As you have probably noticed, self is passed automatically when you call a method on an instance. It holds a reference to the current object (the instance of the class), so you can access anything stored on it, like the attributes _p1 and _p2 in the example above.
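
To make the role of self concrete, here is a small sketch (the Greeter class is invented for this illustration). Calling a method on an instance and calling the same function on the class with the instance passed explicitly give the same result:

```python
class Greeter:
    def __init__(self, name):
        self._name = name

    def greet(self):
        # self is the instance the method was called on
        return "Hello, " + self._name


g = Greeter("Ada")
implicit = g.greet()         # Python supplies g as self
explicit = Greeter.greet(g)  # same call with self passed by hand
print(implicit)
print(explicit)
```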

You can also create static methods, which do not receive self. (These are distinct from class methods, which are declared with @classmethod and receive the class as their first argument.) Use static methods for general calculations or any operation that doesn't need a specific instance. Your intersection method could be a good candidate, e.g.:

class PointCalculator:

    @staticmethod
    def intersection(P1, P2, dist1, dist2):
        # Return the result
        ...

Now you don't need an instance of PointCalculator; you can simply call PointCalculator.intersection(1, 2, 3, 4).

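Another possible advantage is memory use. Python can free an object once nothing refers to it any more, for example when the function that created it returns. Data kept in module-level variables of a long script, by contrast, stays referenced and is not released until the script terminates (or until the names are deleted or rebound). Structuring code into classes and functions therefore tends to limit how long large objects stay alive.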
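
The point about objects being released when they go out of scope can be observed with a weak reference. This is a sketch under assumptions: the Dataset class and process function are hypothetical, and the immediate release shown here is how CPython's reference counting behaves; other Python implementations may free the object later.

```python
import gc
import weakref


class Dataset:
    """Hypothetical stand-in for a large object."""

    def __init__(self, size):
        self.values = list(range(size))


def process():
    data = Dataset(1000)       # local variable, referenced only inside this call
    probe = weakref.ref(data)  # a weak reference does not keep data alive
    return sum(data.values), probe


total, probe = process()
gc.collect()  # not required on CPython; included for other implementations
print(total)
print(probe() is None)  # True once the Dataset has been reclaimed
```

Had data been a module-level variable instead, the weak reference would still resolve to the object at this point.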

Having said that, for small utility scripts that perform very specific tasks (installing an application, configuring a service, running an OS administration task, etc.), a simple script is totally fine, and that is one of the reasons Python is so popular.

Classes vs. Functions

Create a function. Functions do specific things, classes are specific things.

Classes often have methods, which are functions that are associated with a particular class, and do things associated with the thing that the class is - but if all you want is to do something, a function is all you need.

Essentially, a class is a way of grouping functions (as methods) and data (as attributes) into a logical unit revolving around a certain kind of thing. If you don't need that grouping, there's no need to make a class.
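
As a rough illustration of the distinction (the names circle_area and Circle are invented for this sketch): the function does one thing and keeps nothing between calls, while the class is a thing that carries its data along with the operations on it.

```python
import math


# A function does something: it takes input and returns a result.
def circle_area(radius):
    return math.pi * radius ** 2


# A class is something: it groups data with the operations on that data.
class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

    def scaled(self, factor):
        # Returns a new Circle instead of changing this one.
        return Circle(self.radius * factor)


print(circle_area(2))
print(Circle(2).area())
print(Circle(1).scaled(3).radius)
```

If all you ever need is the area, the function alone is enough.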

When do I actually use classes

Classes don't have much appeal when you are coding small programs. But say you are working on a game using the Pygame module: classes would come in handy. For example, creating an in-game character as a class definitely makes life easier, as it packs all the character's information and behavior into one place. By contrast, making changes to a game written in a purely procedural style would be hard and annoying: besides the sheer length of the code, you have to spell out every possible outcome and event by hand. In short, classes are a core tool of object-oriented programming. They may not be necessary in many cases, but they are still an excellent way to make your code shorter and easier to read and amend.
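
A minimal sketch of such a character class is below. It deliberately uses no Pygame calls, and the attribute and method names (health, move, take_damage, is_alive) are assumptions chosen for illustration, not part of any library.

```python
class Character:
    """A hypothetical in-game character; plain Python, no Pygame required."""

    def __init__(self, name, health=100, x=0, y=0):
        self.name = name
        self.health = health
        self.x = x
        self.y = y

    def move(self, dx, dy):
        self.x += dx
        self.y += dy

    def take_damage(self, amount):
        # Health never drops below zero.
        self.health = max(0, self.health - amount)

    def is_alive(self):
        return self.health > 0


hero = Character("Hero")
goblin = Character("Goblin", health=30, x=5, y=5)
hero.move(2, 3)
goblin.take_damage(50)
print(hero.x, hero.y, hero.is_alive())
print(goblin.health, goblin.is_alive())
```

Each character keeps its own position and health, so adding a third one is a single line rather than another set of variables.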

When is it appropriate to organize code using a class with Python?

Since only sales_periods actually uses the instance attributes, and it returns a dict, not another instance of SalesTable, all the other methods can be moved out of the class and defined as regular functions:

class SalesTable:

    def __init__(self, banner, start_year, start_month, end_year, end_month):
        ...

    def sales_periods(self):
        # ...
        return some_dict

def find_sales_period_csv(dct):
    return some_list

def csv_to_df(lst):
    return some_list

def combine_dfs(lst):
    return some_df

def check_data(df):
    pass

And you'll call them all in a chained fashion:

x = SalesTable(...)
check_data(combine_dfs(csv_to_df(find_sales_period_csv(x.sales_periods()))))

Now take a closer look at your class: you only have two methods, __init__ and sales_periods. Unless __init__ does something expensive that you don't want to repeat (and you would call sales_periods on the same instance multiple times), the entire class can be reduced to a single function that combines __init__ and the sales_periods method:

def sales_periods(banner, start_year, start_month, end_year, end_month):
    ...
    return some_dict

check_data(combine_dfs(csv_to_df(find_sales_period_csv(sales_periods(...)))))

Why do we use classes in Python?

This question truly is too broad to get a complete answer, so let me give you one that fits the scope of your example, rather than the scope of your question.

As a few commenters have noticed, you use classes to give your functions a stateful environment to run in. Imagine I wanted to create a counter, for instance. I could use a global variable and a function, like:

counter = 0
def increment_counter():
    global counter
    counter += 1
def decrement_counter():
    global counter
    counter -= 1

but this pollutes our namespace with two extra function signatures and a variable. Plus the global keyword is a code smell that should be avoided when possible. Not to mention you can only have one such counter in your whole code base! If you needed another, you'd have to retype all that code and violate DRY (Don't Repeat Yourself).

Instead we create a class:

class Counter(object):
    def __init__(self):
        # initialize count at zero
        self.count = 0
    def increment(self, by=1):
        self.count += by
    def decrement(self, by=1):
        self.count -= by

Now you can instantiate that as many times as you'd like, and each one keeps track of its own separate count, e.g.:

counter_a = Counter()
counter_b = Counter()

for _ in range(3):
    counter_a.increment()
for _ in range(1000000):
    counter_b.increment()

assert counter_a.count == 3
assert counter_b.count == 1000000

Classes vs Function: Do I need to use 'self' keyword if using a class in Python?

If you develop functions within a Python class, you have two ways of defining a function: one with self as the first parameter and one without self.

So, what is the difference between the two?

Function with self

The first one is a method, which is able to access content within the created object. This allows you to access the internal state of an individual object, e.g., a counter of some sort. These are the methods you usually use in object oriented programming. A short intro can be found here [External Link]. To call these methods, you first need to create an instance of the given class.

Function without self

Functions without self do not need an instance of the class. This is why you can call them directly on the imported class.

Alternative solution

This is based on the comment of Tom K. Instead of using self, you can also use the decorator @staticmethod to indicate the role of the method within your class. Some more info can be found here [External link].

Final thought

To answer your initial question: You do not need to use self. In your case you do not need self, because you do not share the internal state of an object. Nevertheless, if you are using classes you should think about an object oriented design.
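
The difference between the two kinds of function can be sketched as follows (the Temperature class is a made-up example). The method with self reads the state of one object; the static method needs no instance and can be called on the class itself.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    def to_fahrenheit(self):
        # Uses self: depends on the state of this particular object.
        return Temperature.c_to_f(self.celsius)

    @staticmethod
    def c_to_f(celsius):
        # No self: a plain calculation that needs no instance.
        return celsius * 9 / 5 + 32


print(Temperature.c_to_f(100))            # called on the class, no instance needed
print(Temperature(0).to_fahrenheit())     # called on an instance
```

Marking c_to_f with @staticmethod also keeps it callable on instances, which a bare function without self would not be.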

Is there any reason for using classes in Python if there is only one class in the program?

One advantage, though not always applicable, is that it makes it easy to extend the program by subclassing the one class. For example, I can subclass it and override the method that reads from, say, a CSV file so that it reads an XML file instead, and then instantiate the subclass or the original class based on run-time information. From there, the logic of the program can proceed normally.

That, of course, raises the question of whether reading the file is really the responsibility of the class, or whether it more properly belongs to a separate class that has subclasses for reading different types of data and presents a uniform interface to that data. But that is another question.

Personally, I find that the cleanest way to do it is to put functions which make strong assumptions about their parameters on the appropriate class as methods and to put functions which make very weak assumptions about their arguments in the module as functions.
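
The idea described above, subclassing the one class to read a different file format and choosing the class at run time, might look like the following sketch. The Report and XmlReport classes, the make_report helper, and the sample data are all hypothetical; the text is passed in as a string so the example runs without any files.

```python
import csv
import io
import xml.etree.ElementTree as ET


class Report:
    """Hypothetical one-class program: reads records, then summarizes them."""

    def __init__(self, text):
        self.text = text

    def read_records(self):
        # Default behavior: CSV with a header row.
        return list(csv.DictReader(io.StringIO(self.text)))

    def total(self):
        # Shared logic: works with whatever read_records returns.
        return sum(float(record["amount"]) for record in self.read_records())


class XmlReport(Report):
    def read_records(self):
        # Overridden: the same records, stored as XML attributes.
        root = ET.fromstring(self.text)
        return [dict(item.attrib) for item in root.iter("item")]


def make_report(fmt, text):
    # Pick the class from run-time information.
    cls = XmlReport if fmt == "xml" else Report
    return cls(text)


csv_text = "name,amount\na,1.5\nb,2.5\n"
xml_text = '<items><item name="a" amount="1.5"/><item name="b" amount="2.5"/></items>'
csv_total = make_report("csv", csv_text).total()
xml_total = make_report("xml", xml_text).total()
print(csv_total, xml_total)
```

Only read_records differs between the two classes; total is written once and reused.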

When should I write a class instead of a group of functions?

1) Read about SOLID principles. For starters, the first principle is enough: the Single Responsibility Principle. Usually you don't leave loose, hanging methods. The rule of thumb is to build classes that have one precisely defined responsibility. You may define some utility functions (if you need them and Python doesn't provide them already), but you usually group them together in one separate module.

2) How much commenting is enough? That's an easy question - no comments at all should be enough. Code should document itself, by well-named classes and functions. Cool quotation:

Code should read like well-written prose

The harder question is: how much commenting is too much? There are too many comments when they can be replaced by better function and class names and better partitioning of the code.

Comments as a means of explaining code are really just a historical leftover from the days when, for example, variables and functions needed to have short names. Nowadays you don't have to use "d" for a variable name; you can call it "invoiceDueDate" and use it conveniently. Using comments to document the API of a public library is IMHO the only good reason to use them.
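
A small before-and-after sketch of that point (both functions are invented for illustration, and the snake_case names follow Python convention rather than the camelCase used above): the first version needs a comment to be understood, the second says the same thing through its names.

```python
from datetime import date, timedelta


# Before: short names, so a comment has to carry the meaning.
def f(d, n):
    # d = invoice date, n = payment term in days; returns the due date
    return d + timedelta(days=n)


# After: the names carry the meaning, so no comment is needed.
def invoice_due_date(invoice_date, payment_term_days):
    return invoice_date + timedelta(days=payment_term_days)


print(f(date(2024, 1, 1), 30))
print(invoice_due_date(date(2024, 1, 1), 30))
```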


