将具有多个观察值的行创建为列

时间:2022-04-23 19:45:30

I was looking at similar questions but I couldn't find a case similar to mine. I have a data frame that for each subject, has multiple observations per condition. It looks like this:

我正在寻找类似的问题,但我找不到类似于我的案例。我有一个数据框,每个主题,每个条件有多个观察。它看起来像这样:

subject   <- c(rep("S1",4), rep("S2",4))
condition <- rep(c(rep("a",2), rep("b",2)),2)
value     <- c(1:8)
df        <- data.frame(subject,condition,value)

df
  subject condition value
  S1         a     1
  S1         a     2
  S1         b     3
  S1         b     4
  S2         a     5
  S2         a     6
  S2         b     7
  S2         b     8

I would like to reshape it to look like this:

我想重塑它看起来像这样:

  subject condition.a condition.b
  S1           1           3
  S1           2           4
  S2           5           7
  S2           6           8

I have tried reshape and cast, but they give me an error message because there are multiple observations per subject and condition. Does anyone have suggestions on how they would do this?

我尝试过重塑和演员,但他们给我一个错误信息,因为每个主题和条件有多个观察。有没有人建议如何做到这一点?

Thanks!

谢谢!

1 个解决方案

#1


1  

The question is slightly ambiguous in the sense that it is not clear which values of a and b should be coupled with each other.

在不清楚a和b的哪些值应该相互耦合的意义上,这个问题有点含糊不清。

Assuming that you want the 1st value for (S1, a) to couple with the first (S2, b) and so on, then you can add a dummy count column which counts the various occurrences of (subject, condition) and then use the count_id to melt and dcast the data like this:

假设您希望(S1,a)的第一个值与第一个(S2,b)耦合,依此类推,那么您可以添加一个虚拟计数列来计算各种事件(主题,条件),然后使用count_id融化和dcast这样的数据:

library(plyr)
library(reshape2)

subject   <- c(rep("S1",4), rep("S2",4))
condition <- rep(c(rep("a",2), rep("b",2)),2)
value     <- c(1:8)
df        <- data.frame(subject,condition,value)

df.2 <- ddply(df, .(subject, condition), function(x) { x$count <- 1:nrow(x); x })
df.2
#   subject condition value count
# 1      S1         a     1     1
# 2      S1         a     2     2
# 3      S1         b     3     1
# 4      S1         b     4     2
# 5      S2         a     5     1
# 6      S2         a     6     2
# 7      S2         b     7     1
# 8      S2         b     8     2

df.3 <- melt(df.2, id.vars=c('subject', 'condition', 'count'))
dcast(df.3, subject + count ~ condition)
#   subject count a b
# 1      S1     1 1 3
# 2      S1     2 2 4
# 3      S2     1 5 7
# 4      S2     2 6 8

Is this what you wanted?

这是你想要的吗?

#1


1  

The question is slightly ambiguous in the sense that it is not clear which values of a and b should be coupled with each other.

在不清楚a和b的哪些值应该相互耦合的意义上,这个问题有点含糊不清。

Assuming that you want the 1st value for (S1, a) to couple with the first (S2, b) and so on, then you can add a dummy count column which counts the various occurrences of (subject, condition) and then use the count_id to melt and dcast the data like this:

假设您希望(S1,a)的第一个值与第一个(S2,b)耦合,依此类推,那么您可以添加一个虚拟计数列来计算各种事件(主题,条件),然后使用count_id融化和dcast这样的数据:

library(plyr)
library(reshape2)

subject   <- c(rep("S1",4), rep("S2",4))
condition <- rep(c(rep("a",2), rep("b",2)),2)
value     <- c(1:8)
df        <- data.frame(subject,condition,value)

df.2 <- ddply(df, .(subject, condition), function(x) { x$count <- 1:nrow(x); x })
df.2
#   subject condition value count
# 1      S1         a     1     1
# 2      S1         a     2     2
# 3      S1         b     3     1
# 4      S1         b     4     2
# 5      S2         a     5     1
# 6      S2         a     6     2
# 7      S2         b     7     1
# 8      S2         b     8     2

df.3 <- melt(df.2, id.vars=c('subject', 'condition', 'count'))
dcast(df.3, subject + count ~ condition)
#   subject count a b
# 1      S1     1 1 3
# 2      S1     2 2 4
# 3      S2     1 5 7
# 4      S2     2 6 8

Is this what you wanted?

这是你想要的吗?