Trying to use a for loop that cycles through a list of dataframes to compute various summaries

Question

Say I have a big dataset called fulldata with 3 columns called a, b and c as such:

df1 <- data.frame(
  a = rnorm(4, 10), b = rnorm(4, 6), c = 1
  )

df2 <- data.frame(
  a = rnorm(6, 10), b = rnorm(6, 9), c = 2
  )

df3 <- data.frame(
  a = rnorm(8, 8), b = rnorm(8, 9), c = 3
  )

fulldata <- rbind(df1, df2)
fulldata <- rbind(fulldata, df3)

I also have subsets based on the value of c, such that df1 holds the rows where c = 1, and so on up to c = 3. I have vectors referencing these subsets and the column names:

c_values <- c("df1", "df2", "df3")
columns <- c("a", "b", "c")
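If the subsets are not already stored as separate objects, one way to build them is to split fulldata on c. This is only a minimal sketch: the name `subsets` and the call to set.seed() are assumptions for illustration and are not part of the setup above.

```r
# Rebuild a small fulldata like the one in the question (values are random).
set.seed(1)  # assumption: only here so the sketch is reproducible
fulldata <- rbind(
  data.frame(a = rnorm(4, 10), b = rnorm(4, 6), c = 1),
  data.frame(a = rnorm(6, 10), b = rnorm(6, 9), c = 2),
  data.frame(a = rnorm(8, 8),  b = rnorm(8, 9), c = 3)
)

# split() returns a named list with one data frame per distinct value of c.
subsets <- split(fulldata, fulldata$c)
names(subsets) <- paste0("df", names(subsets))  # "df1", "df2", "df3"

# Append the unsplit data so it can be summarised in the same loop later.
subsets[["fulldata"]] <- fulldata
```

Keeping the subsets in one named list means a loop can walk over names(subsets) instead of looking up separate objects by name.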

Essentially, I want to create a five-number summary table for each column (a to c) in each subset, like you would get from summary(x) plus a mean, with columns for min, Q1, median and so on through to the mean. The table should also have a column indicating which subset (c value) is being summarised and another indicating which column (a, b or c) is being summarised. Finally, I want the same summaries for fulldata as a whole, with no subsetting on c.

Edit: I’m sorry, I tried to format this table into the question, but it didn’t appear after I posted it.

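for (column in columns) {
  # df[[column]] selects the column named by the loop variable;
  # df$columns would look for a column literally called "columns"
  summary1 <- c(summary(df1[[column]]), mean(df1[[column]]))
  summary2 <- c(summary(df2[[column]]), mean(df2[[column]]))
  summary3 <- c(summary(df3[[column]]), mean(df3[[column]]))
} # and then bind the summaries together somehow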

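for (c_value in c_values) {
  subset_df <- get(c_value)  # look up the data frame by its name
  for (column in columns) {
    print(summary(subset_df[[column]]))
  }
} # bind the whole table together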

In practice, there are far more than 3 subsets and 3 column names, so I want to be able to cycle through them with a quick loop, hence the vectors of names above, but I can't seem to get the syntax to work.
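One possible way to write such a loop is sketched below. It assumes the data frames are collected in a named list (`data_list`, a hypothetical name) rather than looked up by name, and it uses quantile() and mean() directly so that each statistic lands in its own column. The output column names (`subset`, `column`, `min`, `q1` and so on) and the set.seed() call are likewise assumptions made for illustration.

```r
set.seed(1)  # assumption: only here so the sketch is reproducible
df1 <- data.frame(a = rnorm(4, 10), b = rnorm(4, 6), c = 1)
df2 <- data.frame(a = rnorm(6, 10), b = rnorm(6, 9), c = 2)
df3 <- data.frame(a = rnorm(8, 8),  b = rnorm(8, 9), c = 3)
fulldata <- rbind(df1, df2, df3)

# Named list of everything to summarise; "fulldata" covers the unsubsetted case.
data_list <- list(df1 = df1, df2 = df2, df3 = df3, fulldata = fulldata)
columns <- c("a", "b", "c")

rows <- list()
for (df_name in names(data_list)) {
  for (column in columns) {
    # [[ ]] accepts a string, so the loop variable can pick the column.
    x <- data_list[[df_name]][[column]]
    q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), names = FALSE)
    # One row per (subset, column) pair.
    rows[[length(rows) + 1]] <- data.frame(
      subset = df_name, column = column,
      min = q[1], q1 = q[2], median = q[3], q3 = q[4], max = q[5],
      mean = mean(x)
    )
  }
}

# Stack the one-row data frames into a single summary table.
summary_table <- do.call(rbind, rows)
```

Collecting the rows in a list and binding once at the end avoids growing a data frame inside the loop, and adding more subsets or columns only means extending `data_list` or `columns`.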
