Ave-command i R statistics

Imaging you have some data on unemployment:

unemployment.rate <- c(0.01, 0.17, 0.19, NA, 0.21, 0.14, 0.02,NA, 0.26, 0.27, 0.21, 0.28, 0.23, 0.16, 0.1, NA, 0.23, 0.03, 0.11)
cntry <- c ("SE", "NO", "DK", "SE", "NO", "SE", "DK", "DK", "NO", "DK", "SE", "DK", "DK", "SE", "DK", "SE", "SE", "DK", "NO")
size <- c("Big","Medium","Big","Big","Medium","Small","Big","Medium","Medium","Big","Small","Medium","Medium","Big","Medium","Big","Big","Big","Small")
df <- data.frame(unemployment_rate, cntry, size)

In the data we have some NA. We want to replace these based on the mean of cntry. This can be done using the ave-command (from R-base):

df$unemployment_rate <- ave(df$unemployment.rate, list(df$cntry), FUN=function(x) {x[is.na(x)] <- mean(x,na.rm=TRUE); x; })

We can actually base the new data on both the mean fron cntry and the mean from size.

df$unemployment_rate <- ave(df$unemployment.rate, list(df$cntry, df$size), FUN=function(x) {x[is.na(x)] <- mean(x,na.rm=TRUE); x; })

Amazing! (edited)

Lämna ett svar

E-postadressen publiceras inte. Obligatoriska fält är märkta *

Denna webbplats använder Akismet för att minska skräppost. Lär dig hur din kommentardata bearbetas.