Applying the same function to multiple datasets in R

Time: 2021-10-13 22:56:21

I am new to R and I have RNAseq results for 25 samples. I would like to apply the same function to each of the 25 samples to calculate the correlation of my target gene (say, gene ABC) with all other genes.


I know how to do this individually. Here is my code to do it:


df <- read.table("Sample1.txt", header=TRUE, sep="\t")

# gene expression values of interest
gene <- as.numeric(df["ABC",])

# correlate gene with all other genes in the expression set
correlations <- apply(df, 1, function(x){cor(gene, x)})

But now I have 25 of them. I use lapply to read them all at once.


data <- c("Sample1.txt", "Sample2.txt", ..., "Sample25.txt")
df <- lapply(data, read.table)
names(df) <- data

However, I am lost on how to connect it with the rest of my code above to calculate gene correlation. I have read some of the related threads but still could not figure it out. Could anyone help me? Thanks!


1 solution

#1


2  

You should do:


files <- c("Sample1.txt", "Sample2.txt", ..., "Sample25.txt")

myfunc <- function(file) {
  df <- read.table(file, header=TRUE, sep="\t")

  # gene expression values of interest
  gene <- as.numeric(df["ABC",])

  # correlate gene with all other genes in the expression set
  correlations <- apply(df, 1, function(x) cor(gene, x) )
}

lapply(files, myfunc)

That is the style I recommend for you. Here is how I would write it myself:


myfunc <- function(file) {
  df   <- read.table(file, header=TRUE, sep="\t")
  gene <- as.numeric(df["ABC",]) # gene expression values of interest
  apply(df, 1, FUN=cor, y=gene)  # correlate gene with all others
}

files <- c("Sample1.txt", "Sample2.txt", ..., "Sample25.txt")
lapply(files, myfunc)

You will probably want to save the results to an object:


L <- lapply(files, myfunc)
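
To make the saved list easier to work with, it can be named after the files and bound into a single matrix. The following is only a minimal sketch: the two toy files, the gene names other than ABC, and the helper make_sample() are made up for illustration, and it assumes every file lists the same genes in the same order.

```r
# Toy stand-ins for the real sample files (hypothetical data, not from the question)
set.seed(1)
make_sample <- function() {
  m <- matrix(rnorm(3 * 6), nrow = 3,
              dimnames = list(c("ABC", "G2", "G3"), paste0("cell", 1:6)))
  as.data.frame(m)
}
files <- file.path(tempdir(), c("SampleA.txt", "SampleB.txt"))
for (f in files) {
  # row names are written as the first column, so read.table() restores them
  write.table(make_sample(), f, sep = "\t", quote = FALSE)
}

myfunc <- function(file) {
  df   <- read.table(file, header = TRUE, sep = "\t")
  gene <- as.numeric(df["ABC", ])    # gene expression values of interest
  apply(df, 1, FUN = cor, y = gene)  # correlate gene with all others
}

L <- lapply(files, myfunc)
names(L) <- basename(files)          # label each result with its file name

# One row per gene, one column per file (assumes identical gene order in every file)
cor_mat <- do.call(cbind, L)
```

With the names set, a single sample's result can be pulled out as L[["SampleA.txt"]], and cor_mat["G2", ] shows how one gene's correlation with ABC varies across the files.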

The function can be shortened even further, because cor() accepts matrix arguments:


myfunc <- function(file) {
  df <- read.table(file, header=TRUE, sep="\t")
  cor(t(df), y=as.numeric(df["ABC",])) # correlate gene with all others
}
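
One difference worth knowing is the shape of the result: the apply() version returns a named vector, while the matrix version returns a one-column matrix. The sketch below checks on a small made-up table that the two agree; the data and the gene names other than ABC are hypothetical.

```r
# Hypothetical toy expression table: genes in rows, measurements in columns
set.seed(42)
df <- as.data.frame(matrix(rnorm(4 * 8), nrow = 4,
                           dimnames = list(c("ABC", "G2", "G3", "G4"), NULL)))
gene <- as.numeric(df["ABC", ])

by_row    <- apply(df, 1, FUN = cor, y = gene)  # named numeric vector, one value per gene
by_matrix <- cor(t(df), y = gene)               # 4 x 1 matrix, genes as row names

# Drop the matrix dimension (gene names are kept) before comparing
by_matrix_vec <- by_matrix[, 1]
same <- isTRUE(all.equal(by_row, by_matrix_vec))
```

If a plain vector is preferred downstream, taking the first column as above (or wrapping the call in drop()) brings the matrix form back to the same shape as the apply() result.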
