
Pandas: Filtering Pivot Table Rows Where Count Is Fewer Than Specified Value

I have a pandas pivot table that looks a little like this:

C             bar       foo
A     B
one   A -1.154627 -0.243234
three A -1.327977  0.243234
      B

Solution 1:

In one line:

In [64]: df[df.groupby(level=0).bar.transform(lambda x: len(x) >= 2).astype('bool')]
Out[64]:
              bar       foo
two   A  0.944908  0.701687
      B -0.204075  0.713141
      C  0.730844 -0.022302
three A  1.263489 -1.382653
      B  0.124444  0.907667
      C -2.407691 -0.773040
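For readers who want to run the transform approach end to end, here is a minimal sketch. The sample values and the names `group_sizes` and `filtered` are made up for illustration, and `transform("size")` is swapped in for the lambda above as a variant that broadcasts each group's row count directly.

```python
import pandas as pd

# Hypothetical sample data: 'one' has a single row, 'two' and 'three' have three each.
index = pd.MultiIndex.from_tuples(
    [("one", "A"),
     ("two", "A"), ("two", "B"), ("two", "C"),
     ("three", "A"), ("three", "B"), ("three", "C")],
    names=["A", "B"],
)
df = pd.DataFrame(
    {"bar": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
     "foo": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]},
    index=index,
)

# transform("size") returns a Series aligned with df in which every row
# carries the number of rows in its level-0 group.
group_sizes = df.groupby(level=0)["bar"].transform("size")

# Comparing against the threshold gives a boolean mask over the rows of df.
filtered = df[group_sizes >= 2]
print(filtered)
```

Because the mask is aligned with the original index, the surviving rows keep the order they had in `df`.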

In later releases of pandas (GroupBy.filter was new in 0.12, which was still upcoming when this answer was written), the filter method achieves this faster and more intuitively:

In [65]: df.groupby(level=0).filter(lambda x: len(x['bar']) >= 2)
Out[65]:
              bar       foo
three A  1.263489 -1.382653
      B  0.124444  0.907667
      C -2.407691 -0.773040
two   A  0.944908  0.701687
      B -0.204075  0.713141
      C  0.730844 -0.022302
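The filter call can be wrapped in a small helper so that the threshold is a parameter. This is only a sketch: the helper name `keep_groups_with_min_rows`, its parameters, and the sample values are hypothetical and not part of pandas or the original answer.

```python
import pandas as pd

# Hypothetical sample data: 'one' has a single row, 'two' and 'three' have three each.
index = pd.MultiIndex.from_tuples(
    [("one", "A"),
     ("two", "A"), ("two", "B"), ("two", "C"),
     ("three", "A"), ("three", "B"), ("three", "C")],
    names=["A", "B"],
)
df = pd.DataFrame(
    {"bar": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
     "foo": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]},
    index=index,
)


def keep_groups_with_min_rows(frame, min_rows, level=0):
    """Return only the rows whose index-level group has at least min_rows rows."""
    # filter() calls the function once per group and keeps the whole group
    # when it returns True.
    return frame.groupby(level=level).filter(lambda group: len(group) >= min_rows)


filtered = keep_groups_with_min_rows(df, min_rows=2)
print(filtered)
```

Raising `min_rows` to 4 would drop every group in this sample and return an empty frame with the same columns.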

Solution 2:

One way is to group by the 'A' level and look at the groups of size 3 or more:

In [11]: g = df1.groupby(level='A')

In [12]: g.size()
Out[12]:
A
one      1
three    3
two      3
dtype: int64

In [13]: size = g.size()

In [13]: big_size = size[size>=3]

In [14]: big_size
Out[14]:
A
three    3
two      3
dtype: int64

Then you can see which rows have "good" 'A' values, and slice by these:

In [15]: good_A = df1.index.get_level_values('A').isin(big_size.index)

In [16]: good_A
Out[16]: array([False,  True,  True,  True,  True,  True,  True], dtype=bool)

In [17]: df1[good_A]
Out[17]:
              bar       foo
A     B
three A -1.327977  0.243234
      B  1.327977 -0.079051
      C -0.832506  1.327977
two   A  1.327977 -0.128534
      B  0.835120  1.327977
      C  1.327977  0.838040
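The session above can be reproduced as a standalone script. This is a sketch that rebuilds df1 from the values shown in the question and in Out[17]; the row order of the frame is an assumption inferred from the boolean mask in Out[16], and the name `result` is introduced here for illustration.

```python
import pandas as pd

# Rebuild df1 from the values printed above (assumed order: one, three, two).
index = pd.MultiIndex.from_tuples(
    [("one", "A"),
     ("three", "A"), ("three", "B"), ("three", "C"),
     ("two", "A"), ("two", "B"), ("two", "C")],
    names=["A", "B"],
)
df1 = pd.DataFrame(
    {"bar": [-1.154627, -1.327977, 1.327977, -0.832506, 1.327977, 0.835120, 1.327977],
     "foo": [-0.243234, 0.243234, -0.079051, 1.327977, -0.128534, 1.327977, 0.838040]},
    index=index,
)

# Count the rows in each 'A' group and keep the labels with at least 3 rows.
size = df1.groupby(level="A").size()
big_size = size[size >= 3]

# Build a boolean mask marking rows whose 'A' label is one of the kept labels.
good_A = df1.index.get_level_values("A").isin(big_size.index)

# Slice the original frame with the mask.
result = df1[good_A]
print(result)
```

Keeping `size` and `big_size` as separate steps makes it easy to inspect which groups were dropped before slicing.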
