Pandas: Find Most Common String Per Person
I would like to find the most common string value in animal when aggregating data by id. If the counts are tied, I would like to pick the last value of animal.
Solution 1:
Group by the id & animal columns and get the count and the last date on which they appeared. Then sort the resulting data frame by id, count, last and drop duplicate values on id, keeping the last row, which, due to our ordering, will give the most common animal and, if two animals are tied, the animal that was last observed in the table. Finally, get rid of the extra columns count & last.
columns = ['id', 'animal']
df2 = df.groupby(columns).date.agg(['count', 'last']).reset_index()
df3 = df2.sort_values(['id', 'count', 'last'])
df3.drop_duplicates('id', keep='last')[columns]
# outputs:
   id animal
1   1    dog
2   2    cat
3   3    dog
4   4   fish
5   5    cat
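The steps above can be tried end to end with a small self-contained sketch. The sample rows and dates below are hypothetical (the original table is not shown here), and the sketch assumes the rows are already in chronological order, since the "last" aggregation takes the last value in row order rather than the latest date.

```python
import pandas as pd

# Hypothetical sample data; not the data from the original question.
df = pd.DataFrame({
    "id":     [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "animal": ["dog", "cat", "dog", "cat", "dog", "fish", "cat", "cat", "fish"],
    "date":   pd.date_range("2020-01-01", periods=9),  # one row per day, ascending
})

columns = ["id", "animal"]

# Per (id, animal): how often it occurs and the last date it appears on.
df2 = df.groupby(columns)["date"].agg(["count", "last"]).reset_index()

# Within each id, the most frequent animal sorts last; ties are broken by date.
df3 = df2.sort_values(["id", "count", "last"])

# Keep only the final row per id and drop the helper columns.
result = df3.drop_duplicates("id", keep="last")[columns].reset_index(drop=True)
print(result)
```

In this sample, id 2 and id 3 each contain a tie, so the animal that appears later in the table is the one kept. If the rows are not sorted by date, sorting by date first (or aggregating with "max" instead of "last") would be one way to keep the tie-break meaningful.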
Solution 2:
You can define a custom rule and aggregate using it:
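from collections import Counter

def rule(a):
    m = Counter(a)  # occurrences of each animal within the group
    max_val = max(m.values())
    # scan from the end so that, on a tie, the top animal seen last wins
    return next(x for x in reversed(a.tolist()) if m[x] == max_val)

df.groupby("id").aggregate(rule)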
Output:
   animal
id
1     dog
2     cat
3     dog
4    fish
5     cat
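A minimal self-contained sketch of a custom aggregation rule, for trying the idea on a toy frame. The sample rows and the function name most_common_or_last are illustrative assumptions, not part of the original answer, and the tie-break assumes that "last" means the tied value appearing latest in row order.

```python
from collections import Counter

import pandas as pd

# Hypothetical sample data; not the data from the original question.
df = pd.DataFrame({
    "id":     [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "animal": ["dog", "cat", "dog", "cat", "dog", "fish", "cat", "cat", "fish"],
})


def most_common_or_last(values):
    """Return the most frequent value; on a tie, the tied value seen last."""
    counts = Counter(values)
    top = max(counts.values())
    # Walk backwards so the latest occurrence among the top-count values wins.
    return next(v for v in reversed(values.tolist()) if counts[v] == top)


# Select the animal column explicitly so the rule is applied to it alone.
result = df.groupby("id")["animal"].agg(most_common_or_last)
print(result)
```

Selecting the column before aggregating returns a Series indexed by id and avoids applying the rule to any other columns the frame may have.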