Understanding apply functions for better data analysis with Python

Amit Bhardwaj
2 min read · May 15, 2022


If you have started cleaning and processing data with Python, you have probably come across the pandas apply function and its various uses for modifying one or more columns.

Let’s go through them one by one:

apply is a powerful function, and one of the most widely used for data manipulation and analysis.

This is the official documentation for the function.

Using a lambda function with apply on a single column:

This is the most common way to use apply with a lambda function when manipulating a column. For instance, you might want to:

1. Add some number to every value in a column:

df['Grade'] = df['Grade'].apply(lambda x: x + 20)

2. Remove the spaces from every value in a string column:

df['Grade Name'] = df['Grade Name'].apply(lambda x: str(x).replace(' ', ''))
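To try the two snippets above end to end, here is a minimal sketch. The sample values are made up for illustration; only the column names come from the snippets.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the snippets above.
df = pd.DataFrame({
    "Grade": [55, 62, 71],
    "Grade Name": ["Grade A", "Grade B ", " Grade C"],
})

# 1. Add 20 to every value in the column.
df["Grade"] = df["Grade"].apply(lambda x: x + 20)

# 2. Remove every space from each value; str() guards against non-string cells.
df["Grade Name"] = df["Grade Name"].apply(lambda x: str(x).replace(" ", ""))

print(df)
```

For simple operations like these, the vectorized forms df["Grade"] + 20 and df["Grade Name"].str.replace(" ", "", regex=False) should give the same result and are usually faster. apply earns its keep when the per-value logic is more involved.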

Using a function with apply across multiple columns:

In this case, we want to use two columns to create a third column.

The first step is to write a small function that takes a row, reads the column values it needs, and returns the value for the new column. Passing axis=1 to apply then calls that function once per row.

In [49]: df
Out[49]:
          0         1
0  1.000000  0.000000
1 -0.494375  0.570994
2  1.000000  0.000000
3  1.876360 -0.229738
4  1.000000  0.000000

In [50]: def f(x):
   ....:     return x[0] + x[1]
   ....:

In [51]: df.apply(f, axis=1)  # passes a Series object, row-wise
Out[51]:
0    1.000000
1    0.076619
2    1.000000
3    1.646622
4    1.000000
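The session above uses integer column labels. With named columns the pattern is the same, and the result can be assigned straight to a new column. This is a small sketch: the column names math, science and total, and the helper total_score, are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data with two named columns.
df = pd.DataFrame({
    "math": [70, 85, 90],
    "science": [80, 75, 95],
})

def total_score(row):
    # With axis=1, `row` is a Series indexed by column name.
    return row["math"] + row["science"]

# Call total_score once per row and store the result as a third column.
df["total"] = df.apply(total_score, axis=1)

print(df)
```

The same thing can be written inline as df.apply(lambda row: row["math"] + row["science"], axis=1) when the logic fits on one line.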

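Second example:

# Assumption: lemmatizer is NLTK's WordNetLemmatizer (the original snippet does not define it)
from nltk.stem import WordNetLemmatizer
lemmatizer = WordNetLemmatizer()  # needs the WordNet data: nltk.download('wordnet')

def lemmatize_root(row):
    # Each cell is expected to hold a list of tokens; lemmatize each and join them
    new = [lemmatizer.lemmatize(x) for x in row]
    return ' '.join(new)

# Apply the function to every column except 'title'
for col in df.drop(columns='title').columns:
    df[col] = df[col].apply(lemmatize_root)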
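To see the column loop above run without installing NLTK, here is a rough sketch that swaps the real lemmatizer for a toy stand-in. The function toy_lemmatize and the sample DataFrame are hypothetical; toy_lemmatize only strips a trailing "s" and is not a real lemmatizer.

```python
import pandas as pd

def toy_lemmatize(word):
    # Hypothetical stand-in for lemmatizer.lemmatize: drops a trailing "s".
    return word[:-1] if word.endswith("s") and len(word) > 3 else word

def lemmatize_root(tokens):
    # Each cell holds a list of tokens; return them as one space-joined string.
    return " ".join(toy_lemmatize(t) for t in tokens)

# Hypothetical data: 'title' stays untouched, the other columns hold token lists.
df = pd.DataFrame({
    "title": ["Post 1", "Post 2"],
    "body": [["cats", "chase", "mice"], ["dogs", "bark"]],
    "tags": [["pets", "animals"], ["pets"]],
})

# Apply the function to every column except 'title'.
for col in df.drop(columns="title").columns:
    df[col] = df[col].apply(lemmatize_root)

print(df)
```

Note that drop is called with the columns keyword here. Recent pandas versions no longer accept the axis as a second positional argument, so df.drop('title', 1) fails there.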

Hope it helps!
Thanks :)
