R - Fill elements of a column based on multiple criteria


I have a data frame and want to remove any week that contains an outlier. I would be happy if I could flag the entire week as an outlier, since I understand how to subset from there. I have not been able to come up with an appropriate solution. I keep thinking I am going to need to loop through subsets of weeks to achieve the desired goal, or create a separate function to handle an individual outlier week and then use sapply. I have yet to make either of these solutions viable.

# Daily dates for 2015 and the matching day of the week
date <- seq(as.Date("2015-01-01"), length.out = 365, by = "1 day")
df <- data.frame(date = date)
df$dow <- as.factor(weekdays(df$date))

# Simulated values with fixed bounds at 2.75 standard deviations from the mean
set.seed(1115)
df$var1 <- rnorm(365, 1912, 40795)
stdev <- sd(df$var1, na.rm = TRUE)
avg <- mean(df$var1, na.rm = TRUE)
df$lb <- avg - (2.75 * stdev)
df$ub <- avg + (2.75 * stdev)

# Flag individual outlier rows, then number the weeks (%U: weeks start on Sunday)
df$outlier <- ifelse(df$var1 < df$lb | df$var1 > df$ub, 1, 0)
df$weeknum <- as.numeric(format(df$date, "%U"))
head(df, 17)

> head(df, 17)
         date       dow       var1        lb       ub outlier weeknum
1  2015-01-01  Thursday  -7828.412 -114675.6 120479.8       0       0
2  2015-01-02    Friday  25674.456 -114675.6 120479.8       0       0
3  2015-01-03  Saturday -33588.871 -114675.6 120479.8       0       0
4  2015-01-04    Sunday -54418.175 -114675.6 120479.8       0       1
5  2015-01-05    Monday -10002.002 -114675.6 120479.8       0       1
6  2015-01-06   Tuesday  34050.390 -114675.6 120479.8       0       1
7  2015-01-07 Wednesday -37584.648 -114675.6 120479.8       0       1
8  2015-01-08  Thursday  84048.878 -114675.6 120479.8       0       1
9  2015-01-09    Friday -24801.346 -114675.6 120479.8       0       1
10 2015-01-10  Saturday  33974.637 -114675.6 120479.8       0       1
11 2015-01-11    Sunday  77432.088 -114675.6 120479.8       0       2
12 2015-01-12    Monday 128196.236 -114675.6 120479.8       1       2
13 2015-01-13   Tuesday   9740.418 -114675.6 120479.8       0       2
14 2015-01-14 Wednesday  26539.887 -114675.6 120479.8       0       2
15 2015-01-15  Thursday  12172.834 -114675.6 120479.8       0       2
16 2015-01-16    Friday   1032.544 -114675.6 120479.8       0       2
17 2015-01-17  Saturday  76870.095 -114675.6 120479.8       0       2

In the above example, the desired output is a 1 in the outlier column for each row that corresponds to weeknum = 2.

You say "the desired output is a 1 in the outlier column in each row that corresponds to weeknum = 2." Do you need the outlier column, then? It seems you can subset the data.frame based on the values of the weeknum column, as follows:

df <- df[!(df$weeknum==2),] 
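
If the affected weeks are not known in advance, one possible approach is to spread the per-row flag across each week with base R's ave() and then subset on the result. The following is only a minimal sketch under some assumptions: it rebuilds the question's example data, week_outlier and df_clean are hypothetical names chosen for illustration, and weeknum is assumed to identify a week uniquely (true here because the data covers a single year).

```r
# Rebuild the example data from the question (same seed and thresholds).
df <- data.frame(date = seq(as.Date("2015-01-01"), length.out = 365, by = "1 day"))
set.seed(1115)
df$var1 <- rnorm(365, 1912, 40795)
avg <- mean(df$var1)
stdev <- sd(df$var1)
df$outlier <- as.numeric(df$var1 < avg - 2.75 * stdev | df$var1 > avg + 2.75 * stdev)
df$weeknum <- as.numeric(format(df$date, "%U"))

# week_outlier (hypothetical name): the maximum of outlier within each week,
# so it is 1 on every row of a week that has at least one outlier, else 0.
df$week_outlier <- ave(df$outlier, df$weeknum, FUN = max)

# df_clean (hypothetical name): only the rows from weeks with no outliers.
df_clean <- df[df$week_outlier == 0, ]
```

This avoids both the explicit loop and the sapply helper mentioned in the question. For data that spans several years, the grouping would presumably need the year as well, for example ave(df$outlier, format(df$date, "%Y"), df$weeknum, FUN = max).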
