
Pandas Series.rename Not Reflected In Dataframe Columns

I'm trying to rename a column by validating the values in that column. Here is the set-up:

In [9]: import pandas as pd
In [10]: df = pd.DataFrame(
    ...: {'un

Solution 1:

Re-write your function to accept two parameters:

def validate_column(df, col_name):
    # Assume the value validation has determined that this is the email column
    return df.rename({col_name: 'email'}, axis=1)

Now, call your function through DataFrame.pipe:

df.pipe(validate_column, col_name='unknown_field')

               email
0      bob@gmail.com
1  shirley@gmail.com
2     groza@pubg.com

Very clean. This is useful if you want to chain validations:

(df.pipe(validate_column, col_name='unknown_field')
   .pipe(validate_column, col_name='some_other_field')
   .pipe(validate_column, col_name='third_field')
)

... or modify validate_column to validate multiple columns at a time.
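A minimal sketch of the multi-column variant, under stated assumptions: the names validate_columns, looks_like_email, looks_like_phone and CHECKS are hypothetical, the checks are deliberately naive stand-ins for whatever validation the question actually uses, and the values in some_other_field and third_field are invented sample data.

```python
import pandas as pd

def looks_like_email(series):
    # Hypothetical check: every value is a string containing '@'.
    return bool(series.map(lambda v: isinstance(v, str) and '@' in v).all())

def looks_like_phone(series):
    # Hypothetical check: every value is a string made only of digits.
    return bool(series.map(lambda v: isinstance(v, str) and v.isdigit()).all())

# Hypothetical mapping from target column name to its check.
CHECKS = {'email': looks_like_email, 'phone': looks_like_phone}

def validate_columns(df, col_names):
    # Collect every rename first, then apply them in a single rename call.
    mapping = {}
    for col in col_names:
        for new_name, check in CHECKS.items():
            if check(df[col]):
                mapping[col] = new_name
                break  # first matching check wins
    return df.rename(columns=mapping)

df = pd.DataFrame({
    'unknown_field': ['bob@gmail.com', 'shirley@gmail.com', 'groza@pubg.com'],
    'some_other_field': ['5551234', '5559876', '5550000'],  # invented data
    'third_field': [1, 2, 3],                               # invented data
})

result = df.pipe(validate_columns,
                 col_names=['unknown_field', 'some_other_field', 'third_field'])
print(list(result.columns))
```

Columns that pass no check keep their original name, so third_field is left alone here. If two columns passed the same check, this sketch would produce duplicate column names; a real implementation would need to decide how to handle that.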

Note that the renaming is no longer done in-place, and whatever result is returned from pipe needs to be assigned back.
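To illustrate, here is a short sketch reusing validate_column and the sample data from this answer. It shows that the original frame keeps its column name until the result of pipe is assigned back.

```python
import pandas as pd

def validate_column(df, col_name):
    # Assume the validation (omitted here) decided this is the email column.
    return df.rename({col_name: 'email'}, axis=1)

df = pd.DataFrame(
    {'unknown_field': ['bob@gmail.com', 'shirley@gmail.com', 'groza@pubg.com']}
)

# The renamed copy is discarded, so df itself is unchanged.
df.pipe(validate_column, col_name='unknown_field')
columns_before = list(df.columns)
print(columns_before)

# Assigning the result back makes the rename visible.
df = df.pipe(validate_column, col_name='unknown_field')
columns_after = list(df.columns)
print(columns_after)
```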

Solution 2:

Use the DataFrame's rename method and pass the old-to-new name mapping through its columns argument.

import pandas as pd
df = pd.DataFrame({"unknown_field": ['bob@gmail.com', 'shirley@gmail.com', 'groza@pubg.com']})
df = df.rename(columns={'unknown_field': 'email'})

Output:

               email
0      bob@gmail.com
1  shirley@gmail.com
2     groza@pubg.com
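For contrast, the symptom in the title can be reproduced with a short sketch. The question's own code is truncated above, so this assumes it called rename on the selected column. Series.rename with a scalar argument only sets the name of the Series it returns; the parent DataFrame's column label is untouched.

```python
import pandas as pd

df = pd.DataFrame(
    {'unknown_field': ['bob@gmail.com', 'shirley@gmail.com', 'groza@pubg.com']}
)

# Series.rename returns a new Series whose name is 'email' ...
s = df['unknown_field'].rename('email')
print(s.name)

# ... but the DataFrame still carries the old column label.
print(list(df.columns))

# DataFrame.rename with columns= changes the label on the returned frame.
renamed = df.rename(columns={'unknown_field': 'email'})
print(list(renamed.columns))
```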
