How to reshape the following dataframe in R

I have the following dataframe:

Original:

    ID  C1 C2 C3 C4 C5 C6 C7 C8
    A11  0  1  0  0  0  0  1  0
    A21  0  0  1  1  0  0  0  0
    A31  0  0  0  0  1  0  1  0
    A41  0  0  0  0  0  1  0  0
    A51  0  0  0  0  0  1  0  0
    A61  0  0  0  0  0  1  0  1
    A71  0  0  1  1  0  0  0  0
    A81  1  0  0  1  0  0  1  0
    A91  0  1  0  1  0  0  0  1
    A10  1  0  1  0  0  1  0  1

I would ultimately like to have the data in the following format:

Final:

    A11 C2 C7
    A21 C3 C4
    A31 C5 C7
    A41 C6
    A51 C6
    A61 C6 C8
    A71 C3 C4
    A81 C1 C4 C7
    A91 C2 C4 C8
    A10 C1 C3 C6 C8

So essentially, wherever the value != 0, replace that value with the name of the variable in that column. Is there a way to do the above in R?

Thank you!

Best answer

Here is a method using apply that returns a list where the list item names are the row names:

    # construct reproducible example
    set.seed(1234)
    df <- data.frame(apple = sample(c(0, 1), 10, replace = TRUE),
                     banana = sample(c(0, 1), 10, replace = TRUE),
                     carrot = sample(c(0, 1), 10, replace = TRUE))
    # give it some row names
    rownames(df) <- letters[1:10]
    # return the list
    myList <- apply(df, 1, function(i) names(df)[i != 0])

When using this method, make sure there is sufficient variation in your data. This is because apply (like many R functions) tries to simplify the data type of its output: if every row yields the same number of names, the result is a matrix or a vector rather than a list. The example that @digemall provides,

    df <- structure(list(ID = c("A11", "A21", "A31", "A41", "A51", "A61"),
                         C1 = c(1, 1, 1, 1, 1, 1),
                         C2 = c(0, 0, 0, 0, 0, 0)),
                    .Names = c("ID", "C1", "C2"),
                    row.names = c(NA, 6L),
                    class = "data.frame")

returns a matrix. That is still useful, since it contains the desired information, but it is not the list object that was expected. An even more insidious example is the following:

df <- data.frame(apple=c(0,1), banana=c(1,0))

where the method will return a useless character vector.
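
To see the simplification concretely, the sketch below runs the same apply call on three small inputs and checks what kind of object comes back. The helper name `nonzero_names` and the toy data frames are illustrative assumptions, not part of the original answer.

```r
# Hypothetical helper wrapping the apply() call from the answer.
nonzero_names <- function(d) apply(d, 1, function(i) names(d)[i != 0])

# Rows with differing numbers of non-zero entries: apply() returns a list.
varied <- data.frame(apple = c(1, 0, 1), banana = c(1, 1, 0), carrot = c(1, 0, 0))
r1 <- nonzero_names(varied)
is.list(r1)                        # TRUE

# Every row has the same number (more than one) of non-zero entries:
# the results are bound together into a character matrix.
same_len <- data.frame(apple = c(1, 0), banana = c(1, 1), carrot = c(0, 1))
r2 <- nonzero_names(same_len)
is.matrix(r2)                      # TRUE, one column per original row

# Exactly one non-zero entry per row: a plain character vector.
one_each <- data.frame(apple = c(0, 1), banana = c(1, 0))
r3 <- nonzero_names(one_each)
is.character(r3) && !is.list(r3)   # TRUE
```

Which of the three shapes you get depends only on the data, which is why code that relies on a list coming back can break unexpectedly.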

A safer method, which @digemall suggests, is to use lapply to loop over the rows. Because lapply always returns a list, neither of the previous concerns applies:

    myList <- lapply(seq_len(nrow(df)), function(i) names(df)[df[i, ] == 1])

Now we have to add back the names:

    names(myList) <- row.names(df)
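
As a rough end-to-end sketch, here is the lapply approach applied to the data from the question. It assumes the identifiers live in an `ID` column rather than in the row names, so that column is set aside before testing for non-zero values. The names `value_cols`, `vals`, and `out` are made up for illustration.

```r
# Rebuild the data frame from the question.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
ID  C1 C2 C3 C4 C5 C6 C7 C8
A11 0 1 0 0 0 0 1 0
A21 0 0 1 1 0 0 0 0
A31 0 0 0 0 1 0 1 0
A41 0 0 0 0 0 1 0 0
A51 0 0 0 0 0 1 0 0
A61 0 0 0 0 0 1 0 1
A71 0 0 1 1 0 0 0 0
A81 1 0 0 1 0 0 1 0
A91 0 1 0 1 0 0 0 1
A10 1 0 1 0 0 1 0 1
")

# Keep only the indicator columns (assumption: ID is a regular column).
value_cols <- setdiff(names(df), "ID")
vals <- df[value_cols]

# For each row, collect the names of the columns holding a non-zero value.
myList <- lapply(seq_len(nrow(vals)), function(i) {
  value_cols[unlist(vals[i, ]) != 0]
})
names(myList) <- df$ID

# Collapse each list entry into one line, as in the "Final" layout.
out <- vapply(names(myList),
              function(id) paste(c(id, myList[[id]]), collapse = " "),
              character(1))
cat(out, sep = "\n")
```

Looking entries up by name in the last step relies on the IDs being unique, which holds for this data.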
