R - Group by combined with matching
Here is a small reproducible sample of my data set:

team1 <- c("atl", "chi", "cle", "det", "gsw", "nop", "bkn", "atl", "phi", "chi")
home.away <- c("vs.", "vs.", "@", "@", "vs.", "@", "vs.", "vs.", "@", "@")
team2 <- c("det", "cle", "chi", "atl", "nop", "gsw", "chi", "phi", "atl", "bkn")
date <- as.Date(c("2015-05-14", "2015-05-14", "2015-05-14", "2015-05-14", "2015-05-14",
                  "2015-05-14", "2015-05-15", "2015-05-15", "2015-05-15", "2015-05-15"))
pts <- c(94, 97, 95, 106, 111, 95, 100, 112, 87, 94)
df <- data.frame(team1, home.away, team2, pts, date)
df

team1  home.away  team2  pts  date
atl    vs.        det     94  2015-05-14
chi    vs.        cle     97  2015-05-14
cle    @          chi     95  2015-05-14
det    @          atl    106  2015-05-14
gsw    vs.        nop    111  2015-05-14
nop    @          gsw     95  2015-05-14
bkn    vs.        chi    100  2015-05-15
atl    vs.        phi    112  2015-05-15
phi    @          atl     87  2015-05-15
chi    @          bkn     94  2015-05-15
The data frame is organized at the team level, so each game produces two rows of data: for example, Atlanta vs. Detroit (first row) and Detroit @ Atlanta (fourth row). The data frame includes box-score statistics (pts, reb, ast...) for team1; in this example I have included only the points-scored variable. I want to create a new variable, "points scored by the opponent team".
The output should look like this:

team1  home.away  team2  pts  date        pts.oppt
atl    vs.        det     94  2015-05-14  106
chi    vs.        cle     97  2015-05-14   95
cle    @          chi     95  2015-05-14   97
det    @          atl    106  2015-05-14   94
gsw    vs.        nop    111  2015-05-14   95
nop    @          gsw     95  2015-05-14  111
bkn    vs.        chi    100  2015-05-15   94
atl    vs.        phi    112  2015-05-15   87
phi    @          atl     87  2015-05-15  112
chi    @          bkn     94  2015-05-15  100
I tried grouping by date and then doing some sort of matching, but I couldn't figure out the matching part.
> team1 <- c("atl", "chi", "cle", "det", "gsw", "nop", "bkn", "atl", "phi", "chi")
> home.away <- c("vs.", "vs.", "@", "@", "vs.", "@", "vs.", "vs.", "@", "@")
> team2 <- c("det", "cle", "chi", "atl", "nop", "gsw", "chi", "phi", "atl", "bkn")
> date <- as.Date(c("2015-05-14", "2015-05-14", "2015-05-14",
+   "2015-05-14", "2015-05-14", "2015-05-14", "2015-05-15", "2015-05-15",
+   "2015-05-15", "2015-05-15"))
> pts <- c(94, 97, 95, 106, 111, 95, 100, 112, 87, 94)
> df <- data.frame(team1, home.away, team2, pts, date)
>
> # Self-join: match each row to the row with team1 and team2 swapped on the same date
> df <- merge(df, df, by.x = c("team1", "team2", "date"), by.y = c("team2", "team1", "date"))
> # Keep this team's columns (.x) plus the opponent's points (pts.y)
> df <- df[, c("team1", "home.away.x", "team2", "pts.x", "date", "pts.y")]
> names(df) <- c("team1", "home.away", "team2", "pts", "date", "pts.oppt")
> df
   team1 home.away team2 pts       date pts.oppt
1    atl       vs.   det  94 2015-05-14      106
2    atl       vs.   phi 112 2015-05-15       87
3    bkn       vs.   chi 100 2015-05-15       94
4    chi         @   bkn  94 2015-05-15      100
5    chi       vs.   cle  97 2015-05-14       95
6    cle         @   chi  95 2015-05-14       97
7    det         @   atl 106 2015-05-14       94
8    gsw       vs.   nop 111 2015-05-14       95
9    nop         @   gsw  95 2015-05-14      111
10   phi         @   atl  87 2015-05-15      112
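Note that merge() sorts the result by the key columns by default, which is why the rows above come back in a different order than in the question. If the original row order matters, one possible alternative is to look up each row's mirror image with match(). This is only a minimal sketch: it assumes every game appears exactly twice (once per team) and that a given pair of teams plays at most once per date. The names key and key.oppt are made up for illustration.

```r
# Sample data from the question
team1 <- c("atl", "chi", "cle", "det", "gsw", "nop", "bkn", "atl", "phi", "chi")
home.away <- c("vs.", "vs.", "@", "@", "vs.", "@", "vs.", "vs.", "@", "@")
team2 <- c("det", "cle", "chi", "atl", "nop", "gsw", "chi", "phi", "atl", "bkn")
date <- as.Date(c("2015-05-14", "2015-05-14", "2015-05-14", "2015-05-14", "2015-05-14",
                  "2015-05-14", "2015-05-15", "2015-05-15", "2015-05-15", "2015-05-15"))
pts <- c(94, 97, 95, 106, 111, 95, 100, 112, 87, 94)
df <- data.frame(team1, home.away, team2, pts, date)

# key describes a row as "team1 team2 date"; key.oppt describes the same
# row with the teams swapped ("team2 team1 date"). Both names are hypothetical.
key      <- paste(df$team1, df$team2, df$date)
key.oppt <- paste(df$team2, df$team1, df$date)

# For each row, find the row whose swapped key equals this row's key
# (the opponent's row for the same game) and take its points.
df$pts.oppt <- df$pts[match(key, key.oppt)]
df
```

Because match() returns positions in the original data frame, the rows stay in their original order, and any row without a counterpart would get NA in pts.oppt rather than being dropped.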