What best practices do you use in R?

Date: 2020-12-24 20:03:38

What are some good practices for programming in R?

Since R is a special-purpose language that I don't use all the time, I typically just hack together some quick scripts that do what I need.

But what are some tips for writing clean and efficient R code?

5 answers

#1


20  

You already provide some hints by stating that your approach is to 'hack quick scripts'. If you want best practices and structure, simply follow the established best practices from CRAN:

  • create a package; this opens the door to running R CMD check, which is very useful
  • as many people have stated, having a package helps you in the code-writing stage too, as you are somewhat forced to document the code; that is a Good Thing (TM)
  • once you have a package, add code to the \examples{} section of the documentation, as this is run during R CMD check and provides an easy entry into regression testing
  • once you get used to regression testing, start using a package such as RUnit; that really is best practice
  • JD's pointer to the Google Style Guide is a good one too. It isn't the only style guide: e.g. Henrik's R Coding Convention precedes it by a few years, and there is also Hadley's riff on Google's style guide
  • otherwise, the oldie-but-goldie 'do what your colleagues and coauthors do' also applies
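As a rough illustration of the \examples{} and regression-testing points above, here is a minimal sketch. The function `rescale01` and its roxygen-style comment block are hypothetical and not taken from the answer; the `@examples` tag is what roxygen turns into the \examples{} section. The checks use base R's `stopifnot()` so the snippet runs without extra packages; in a real package a framework such as RUnit would play that role.

```r
#' Rescale a numeric vector to the range [0, 1] (hypothetical example function)
#'
#' @param x A numeric vector with at least two distinct values.
#' @return A numeric vector of the same length as x.
#' @examples
#' rescale01(c(2, 4, 6))
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Simple regression checks; a test framework automates checks of this kind.
stopifnot(isTRUE(all.equal(rescale01(c(2, 4, 6)), c(0, 0.5, 1))))
stopifnot(length(rescale01(1:10)) == 10)
```

If a later change to `rescale01` breaks either check, the script stops with an error, which is the basic idea behind regression testing.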

#2


14  

I recommend Josh Reich's Load, Clean, Func, Do workflow from this previous question.
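The answer only names the workflow. As commonly described, it splits a project into four stages (load the data, clean it, define helper functions, run the analysis), often with one file per stage. Below is a single-file sketch under that assumption, using the built-in `mtcars` dataset as a stand-in for real data; all function names are hypothetical.

```r
# load: read the raw data (mtcars stands in for a file or database here)
load_data <- function() {
  mtcars
}

# clean: fix types and drop rows with missing values
clean_data <- function(df) {
  df$cyl <- factor(df$cyl)
  df[complete.cases(df), ]
}

# func: helper functions used by the analysis
mean_mpg_by_cyl <- function(df) {
  tapply(df$mpg, df$cyl, mean)
}

# do: run the analysis using the pieces above
cars <- clean_data(load_data())
result <- mean_mpg_by_cyl(cars)
print(round(result, 1))
```

Keeping the stages separate means the slow or messy parts (loading and cleaning) can be rerun or replaced without touching the analysis code.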

In addition I recommend following coding guidelines such as Google's R Style Guide. Using a coding style guide makes reading the code later so much easier.

#3


6  

I completely agree with the existing answers, especially regarding the use of packages. Packages require a lot of discipline, documentation, and structure, which really helps to enforce best practices (along with R CMD check). You can also use the codetools package to help with this. Use the roxygen package for documentation.

Beyond that, I recommend that you not only vectorize your code, but more particularly, make every effort to vectorize your functions, meaning that you should be able to provide vector arguments and get vectors returned (even from things like database calls). That will really improve your code efficiency and clarity in the long run.
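To illustrate what a vectorized function means here, a small sketch follows; `classify_scalar` and `classify_vec` are made-up names. The first version only handles a single value because `if` tests one condition, while the second uses `ifelse()`, so it accepts a whole vector and returns a vector of the same length.

```r
# Scalar-only version: if() is not designed for a vector condition
classify_scalar <- function(x) {
  if (x >= 0) "non-negative" else "negative"
}

# Vectorized version: ifelse() works element-wise
classify_vec <- function(x) {
  ifelse(x >= 0, "non-negative", "negative")
}

values <- c(-2, 0, 3.5)
labels <- classify_vec(values)
print(labels)

# The scalar version has to be wrapped to handle vectors
labels_wrapped <- vapply(values, classify_scalar, character(1))
stopifnot(identical(labels, labels_wrapped))
```

Callers of the vectorized version never need to write the `vapply()` wrapper themselves, which is where the gain in clarity comes from.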

Lastly, whenever I write a report, I really like to use something like Sweave to organize my code into clear, literate, reproducible research. Along with this I recommend using the cache package.

#4


2  

For efficiency, prefer vector operations over for loops.
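A small sketch of the difference, assuming nothing beyond base R: both versions compute the same squares, but the vectorized one is a single expression and is typically much faster. The timings printed at the end depend on the machine, so no numbers are claimed here.

```r
x <- seq_len(100000)

# Loop version: fills the result one element at a time
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Vectorized version: one expression over the whole vector
squares_vec <- x^2

stopifnot(isTRUE(all.equal(squares_loop, squares_vec)))

# Rough timing comparison; exact numbers depend on the machine
print(system.time(for (i in seq_along(x)) squares_loop[i] <- x[i]^2))
print(system.time(squares_vec <- x^2))
```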

#5


1  

This is good programming practice in general, but use a version control system such as SVN to manage your code.
