r - Combine two dataframes that have common rows and columns (fill in) -


i have big dataframe want fill in results sql queries many databases, kind of "filling in data cubbyholes" speak. wrinkle: don't know how many cubbyholes filled in (there's group_by year, might dataframe 1 year, or many).

i'm having difficulty figuring out how accomplish this. i'm trying use dplyr package..

  • left_join either adds same row twice (if specify by=), or drops new column (if don't specify by= , joins on both similar columns)enter image description here

    enter image description here

  • bind_cols doesn't work

  • bind_rows adds duplicate row.

    enter image description here

how can new data fill in cubbyhole itself? (btw, i'm not married dplyr... don't want iterate through every element of new dataframe)

code follows:

library(dplyr) targetdf <- structure(list(ind = c(5, 6, 7), `2015 act` = c(7870l, na, na                                                             )), .names = c("ind", "2015 act"), class = c("tbl_df", "data.frame"                                                                                                          ), row.names = c(na, -3l))  tempdf <- structure(list(ind = 6, `2015 act` = 49782l, `2016 act` = 323l), .names = c("ind",                                                                                        "2015 act", "2016 act"), class = c("tbl_df", "tbl", "data.frame"                                                                                       ), row.names = c(na, -1l)) left_join(targetdf,tempdf, by= "ind") ## gives duplicate columns  left_join(targetdf,tempdf) ## loses new "2015 act" data ind 6  bind_cols(targetdf,tempdf) ## don't work  bind_rows(targetdf,tempdf) ## double ind 6 (there other columns nor included here, why can't !is.na() eliminate duplicate ind 6) 

one possible way non-na values out of each column grouped ind, , otherwise, leave (generate) na

full_join(targetdf, tempdf) %>%    group_by(ind) %>%    summarise_each(funs(.[!is.na(.)][1l]))  # source: local data frame [3 x 3] #  #     ind 2015 act 2016 act #   (dbl)    (int)    (int) # 1     5     7870       na # 2     6    49782      323 # 3     7       na       na 

Comments

Popular posts from this blog

javascript - jQuery: Add class depending on URL in the best way -

caching - How to check if a url path exists in the service worker cache -

Redirect to a HTTPS version using .htaccess -