Thursday, July 18, 2013

Join data frames in R (inner, outer, left, right)



I recently had two data frames that I wanted to join into one, matching rows on the exact dates shared by the two frames.
df1 <- data.frame( Registrations.x = c(7,15,13,20))
df1$Dates <- c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-06")
df2 <- data.frame(Registrations.y = c(21,36,23,16,28,22))
df2$Dates <- c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04", "2013-01-05", "2013-01-06")
Let me show you how to run SQL-like joins in R using the merge function and its optional parameters to achieve the following:
  1. An inner join of df1 and df2
  2. A full outer join of df1 and df2
  3. A left outer join of df1 and df2
  4. A right outer join of df1 and df2
  5. A cross join of df1 and df2

This is how I've used the merge function and its optional parameters:
Inner join: merge(df1, df2, by = "Dates") NB: "by" may be left out, in which case merge joins on the column names the two data frames share
Outer join: merge(x = df1, y = df2, by = "Dates", all = TRUE)
Left outer: merge(x = df1, y = df2, by = "Dates", all.x=TRUE)
Right outer: merge(x = df1, y = df2, by = "Dates", all.y=TRUE)
Cross join: merge(x = df1, y = df2, by = NULL)
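To see what each call returns, here is a small sketch that rebuilds the two data frames from the top of the post and counts the rows in each result. The list name `joins` and the variable `row_counts` are just names I picked for illustration.

```r
# Rebuild the example data frames from the post
df1 <- data.frame(Registrations.x = c(7, 15, 13, 20))
df1$Dates <- c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-06")
df2 <- data.frame(Registrations.y = c(21, 36, 23, 16, 28, 22))
df2$Dates <- c("2013-01-01", "2013-01-02", "2013-01-03",
               "2013-01-04", "2013-01-05", "2013-01-06")

# Run each join and keep the results in a named list
joins <- list(
  inner = merge(x = df1, y = df2, by = "Dates"),
  full  = merge(x = df1, y = df2, by = "Dates", all = TRUE),
  left  = merge(x = df1, y = df2, by = "Dates", all.x = TRUE),
  right = merge(x = df1, y = df2, by = "Dates", all.y = TRUE),
  cross = merge(x = df1, y = df2, by = NULL)
)

# Number of rows produced by each join
row_counts <- sapply(joins, nrow)
print(row_counts)

# The full outer join keeps the two dates missing from df1
# and fills Registrations.x with NA for them
print(joins$full)
```

With this data the inner and left joins give 4 rows, the full and right joins give 6 rows, and the cross join gives 4 x 6 = 24 rows. Because the cross join does not match on Dates, the Dates column from each frame is kept and renamed Dates.x and Dates.y.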
It's advisable to state explicitly, with the "by" parameter, the identifiers on which you want to merge.
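Here is a minimal sketch of why being explicit about "by" helps. The data frames `a` and `b` are made up for illustration: both have a Dates column and a Registrations column, so leaving out "by" makes merge match on both columns at once.

```r
# Two hypothetical frames that share BOTH column names
a <- data.frame(Dates = c("2013-01-01", "2013-01-02"),
                Registrations = c(7, 15))
b <- data.frame(Dates = c("2013-01-01", "2013-01-02"),
                Registrations = c(21, 36))

# Without "by", merge joins on every shared column (Dates and Registrations)
implicit <- merge(a, b)

# With "by", only Dates is used as the key; the other shared
# column gets the default suffixes .x and .y
explicit <- merge(a, b, by = "Dates")

print(nrow(implicit))
print(names(explicit))
```

Here the implicit merge returns no rows, because no row agrees on both Dates and Registrations, while the explicit merge returns both dates with the columns Registrations.x and Registrations.y side by side.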


Add any comments if it helped :)