Pandas Series.rename Not Reflected In Dataframe Columns
I'm trying to rename a column based on validating the values in that column. Here is the set-up:
In [9]: import pandas as pd
In [10]: df = pd.DataFrame(
    ...: {'un
Solution 1:
Re-write your function to accept two parameters:
def validate_column(df, col_name):
    # Value validation method returns that this column is email column
    return df.rename({col_name: 'email'}, axis=1)
Now, call your function through DataFrame.pipe:
df.pipe(validate_column, col_name='unknown_field')
               email
0      bob@gmail.com
1  shirley@gmail.com
2     groza@pubg.com
Very clean. This is useful if you want to chain validations:
(df.pipe(validate_column, col_name='unknown_field')
.pipe(validate_column, col_name='some_other_field')
.pipe(validate_column, col_name='third_field')
)
... or modify validate_column to validate multiple columns at a time.
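A minimal sketch of such a multi-column variant. The original question does not show its validation logic, so looks_like_email and validate_columns are hypothetical names introduced here purely for illustration, and the "contains @" rule is an assumption:

```python
import pandas as pd


def looks_like_email(series):
    # Hypothetical check (an assumption, not the asker's real rule):
    # every value is a string containing "@".
    return bool(series.map(lambda v: isinstance(v, str) and "@" in v).all())


def validate_columns(df, col_names):
    # Collect every column that passes the check, then rename in one call.
    mapping = {}
    for col in col_names:
        if looks_like_email(df[col]):
            mapping[col] = "email"
    # rename returns a new DataFrame; the input frame is left untouched.
    return df.rename(columns=mapping)


df = pd.DataFrame({
    "unknown_field": ["bob@gmail.com", "shirley@gmail.com", "groza@pubg.com"],
    "some_other_field": [1, 2, 3],
})
result = df.pipe(validate_columns, col_names=["unknown_field", "some_other_field"])
```

If more than one column passes the check, this sketch would give them all the same label, so a real implementation would probably need a rule for picking distinct names.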
Note that the renaming is no longer done in-place, so whatever pipe returns needs to be assigned back.
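To illustrate the point about assigning back, here is a short sketch that reuses the validate_column function from above on a smaller frame. The variable names unchanged and renamed are only there for the demonstration:

```python
import pandas as pd


def validate_column(df, col_name):
    # Same shape as the function above: returns a renamed copy.
    return df.rename({col_name: "email"}, axis=1)


df = pd.DataFrame({"unknown_field": ["bob@gmail.com", "shirley@gmail.com"]})

# Calling pipe without assigning discards the renamed copy.
df.pipe(validate_column, col_name="unknown_field")
unchanged = list(df.columns)

# Assigning the result back is what makes the new label stick.
df = df.pipe(validate_column, col_name="unknown_field")
renamed = list(df.columns)
```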
Solution 2:
Use the DataFrame's rename method with the columns argument, and assign the result back.
import pandas as pd
df = pd.DataFrame({"unknown_field": ['bob@gmail.com', 'shirley@gmail.com', 'groza@pubg.com']})
df = df.rename(columns={'unknown_field': 'email'})
Output:
               email
0      bob@gmail.com
1  shirley@gmail.com
2     groza@pubg.com
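For contrast with the behaviour named in the title, the sketch below shows what is likely going on, assuming the original attempt called rename on the column Series itself (the question text is cut off, so that is a guess). Series.rename with a scalar returns a new Series with a different name and leaves the parent DataFrame's column labels alone:

```python
import pandas as pd

df = pd.DataFrame(
    {"unknown_field": ["bob@gmail.com", "shirley@gmail.com", "groza@pubg.com"]}
)

# Renaming the Series produces a new, renamed Series...
s = df["unknown_field"].rename("email")
series_name = s.name
# ...but the DataFrame still has its old column label.
cols_after_series_rename = list(df.columns)

# DataFrame.rename changes the label on the frame it returns.
df = df.rename(columns={"unknown_field": "email"})
cols_after_df_rename = list(df.columns)
```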