strange behaviour when using filter + if_else #474
Open
Description
Hi - the following example produces strange results:
import siuba as sb
from siuba import _, mutate, count, if_else
from siuba.data import penguins
print(f'initial rows:{penguins.shape[0]}')
dat = penguins >> sb.filter(_.island != "Torgersen")
print(f'rows after filtering:{dat.shape[0]}')
dat = dat >> mutate(
    binary_col = if_else(_.island == 'Biscoe', 1, 0)
)
dat_count = dat >> count(_.binary_col)
print(dat_count)
I use filter to drop some of the rows. When I then call mutate on the filtered dataframe, the previously dropped rows
somehow still show up in the result.
I would expect a count output like:
   binary_col    n
0         0.0  110
1         1.0  130
but instead the dropped observations appear as an extra group labeled NaN:
   binary_col    n
0         0.0  110
1         1.0  130
2         NaN   52
What am I doing wrong?
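One possible explanation, sketched here in plain pandas rather than siuba: filtering keeps the original row labels, and if the value produced inside mutate comes back as a Series with a fresh 0..n-1 index, the assignment aligns on labels and leaves NaN wherever a label has no match. That the helper returns a freshly indexed Series is an assumption, not something the report confirms, and the toy data below is hypothetical.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the penguins data (hypothetical values).
df = pd.DataFrame({"island": ["Torgersen", "Biscoe", "Dream", "Biscoe"]})

# Filtering keeps the original row labels: the index is now [1, 2, 3].
dat = df[df["island"] != "Torgersen"]

# Assumption: the helper returns a Series with a fresh 0..n-1 index,
# which is what wrapping np.where output in pd.Series produces.
fresh = pd.Series(np.where(dat["island"] == "Biscoe", 1, 0))

# Assignment aligns on index labels, so label 3 has no partner and becomes NaN,
# and the remaining values land on the wrong rows.
misaligned = dat.assign(binary_col=fresh)
print(misaligned)

# Workaround: reset the index after filtering so the labels line up again.
clean = dat.reset_index(drop=True)
aligned = clean.assign(
    binary_col=pd.Series(np.where(clean["island"] == "Biscoe", 1, 0))
)
print(aligned)
```

If this is what is happening, calling `reset_index(drop=True)` on the filtered dataframe before the mutate step should make the NaN group disappear.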