Why use Python's os module methods instead of executing shell commands directly?

Date: 2021-09-19 16:54:00

I am trying to understand the motivation behind using Python's library functions for OS-specific tasks such as creating files/directories, changing file attributes, etc., instead of just executing those commands via os.system() or subprocess.call().

For example, why would I want to use os.chmod instead of doing os.system("chmod...")?


I understand that it is more "pythonic" to use Python's available library methods as much as possible instead of just executing shell commands directly. But, is there any other motivation behind doing this from a functionality point of view?


I am only talking about executing simple one-line shell commands here. When we need more control over the execution of the task, I understand that using the subprocess module makes more sense, for example.

6 answers

#1


322  

  1. It's faster: os.system and subprocess.call create new processes, which is unnecessary for something this simple. In fact, os.system and subprocess.call with the shell argument usually create at least two new processes: the first one being the shell, and the second one being the command that you're running (if it's not a shell built-in like test).

  2. Some commands are useless in a separate process. For example, if you run os.system("cd dir/"), it will change the current working directory of the child process, but not of the Python process. You need to use os.chdir for that.

  3. You don't have to worry about special characters interpreted by the shell. os.chmod(path, mode) will work no matter what the filename is, whereas os.system("chmod 777 " + path) will fail horribly if the filename is something like "; rm -rf ~". (Note that you can work around this if you use subprocess.call without the shell argument.)

  4. You don't have to worry about filenames that begin with a dash. os.chmod("--quiet", mode) will change the permissions of the file named --quiet, but os.system("chmod 777 --quiet") will fail, as --quiet is interpreted as an option. This is true even for subprocess.call(["chmod", "777", "--quiet"]).

  5. You have fewer cross-platform and cross-shell concerns, as Python's standard library is supposed to deal with that for you. Does your system have a chmod command? Is it installed? Does it support the parameters that you expect it to support? The os module tries to be as cross-platform as possible and documents the cases where that's not possible.

  6. If the command you're running has output that you care about, you need to parse it, which is trickier than it sounds, as you may forget about corner-cases (filenames with spaces, tabs and newlines in them), even when you don't care about portability.

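To make points 2 and 4 above concrete, here is a minimal sketch, assuming a POSIX-like system (permission bits behave differently on Windows). The helper names cwd_after_shell_cd and chmod_dash_file are made up for illustration and are not part of the answer.

```python
import os
import stat
import tempfile


def cwd_after_shell_cd():
    """Run `cd ..` through os.system and report the parent's cwd before and after."""
    before = os.getcwd()
    os.system("cd ..")  # only the short-lived child shell changes directory
    after = os.getcwd()
    return before, after


def chmod_dash_file(mode=0o600):
    """Create a file literally named '--quiet' in a temp dir and chmod it."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        os.chdir(tmp)
        try:
            with open("--quiet", "w"):
                pass
            # The name is passed as data; nothing parses it as an option.
            os.chmod("--quiet", mode)
            return stat.S_IMODE(os.stat("--quiet").st_mode)
        finally:
            os.chdir(original_cwd)  # leave the temp dir before it is removed


if __name__ == "__main__":
    print(cwd_after_shell_cd())
    print(oct(chmod_dash_file()))
```

The first helper should return the same directory twice, since the `cd` only happened inside the child shell. The second should report the requested mode, because os.chmod never looks at the name for options.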

#2


132  

It is safer. To give you an idea, here is an example script:

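import os

# Deliberately vulnerable: the user's input is pasted into a shell command line.
file = input("Please enter a file: ")  # raw_input() on Python 2
os.system("chmod 777 " + file)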

If the user entered "test; rm -rf ~", this would then delete the home directory.

This is why it is safer to use the built-in function.

It is also why, when you do need to run an external command, you should use subprocess rather than os.system.
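A hedged sketch of the same idea written with os.chmod, assuming a POSIX-like system. The names set_mode and chmod_hostile_name are illustrative, and the file name uses a harmless `echo` in place of `rm -rf ~`.

```python
import os
import stat
import tempfile


def set_mode(path, mode=0o777):
    """Apply `mode` to `path` without involving a shell."""
    os.chmod(path, mode)  # `path` is only ever treated as a filename
    return stat.S_IMODE(os.stat(path).st_mode)


def chmod_hostile_name():
    """chmod a file whose name would be dangerous on a shell command line."""
    with tempfile.TemporaryDirectory() as tmp:
        hostile = os.path.join(tmp, "test; echo injected")
        with open(hostile, "w"):
            pass
        # Nothing after the ';' is executed; it is just part of the name.
        return set_mode(hostile)


if __name__ == "__main__":
    print(oct(chmod_hostile_name()))
```

If the name does not refer to an existing file, os.chmod raises an exception instead of running anything.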

#3


59  

There are four strong cases for preferring Python's more-specific methods in the os module over using os.system or the subprocess module when executing a command:


  • Redundancy - spawning another process is redundant and wastes time and resources.
  • Portability - Many of the methods in the os module are available on multiple platforms, while many shell commands are OS-specific.
  • Understanding the results - Spawning a process to execute arbitrary commands forces you to parse the results from the output and work out if and why a command has done something wrong.
  • Safety - A process can potentially execute any command it's given. This is a weak design and it can be avoided by using specific methods in the os module.

Redundancy (see redundant code):

You're actually executing a redundant "middle-man" on your way to the eventual system calls (chmod in your example). This middle man is a new process or sub-shell.


From os.system:


Execute the command (a string) in a subshell ...


And subprocess is just a module to spawn new processes.


You can do what you need without spawning these processes.


Portability (see source code portability):

The os module's aim is to provide generic operating-system services, and its description starts with:

This module provides a portable way of using operating system dependent functionality.


You can use os.listdir on both Windows and Unix. Trying to use os.system / subprocess for this functionality will force you to maintain two calls (for ls / dir) and check what operating system you're on. This is not as portable and will cause even more frustration later on (see Handling Output).

Understanding the command's results:

Suppose you want to list the files in a directory.


If you're using os.system("ls") / subprocess.call(['ls']), the most you can get back (by capturing the process's output, for example with subprocess.check_output) is basically one big string with the file names.

How can you tell a file with a space in its name from two files?

What if you have no permission to list the files?


How should you map the data to Python objects?

These are only off the top of my head, and while there are solutions to these problems - why solve again a problem that was solved for you?


This is an example of following the Don't Repeat Yourself principle (often referred to as "DRY") by not repeating an implementation that already exists and is freely available to you.
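As an illustration of the questions above, here is a minimal sketch using os.listdir. The helper names list_names and listing_demo and the file names are assumptions made for the example.

```python
import os
import tempfile


def list_names(directory):
    """Return the entries of `directory` as a sorted list of strings."""
    try:
        return sorted(os.listdir(directory))
    except PermissionError:
        # A missing permission surfaces as an exception, not as text to parse.
        return None


def listing_demo():
    """List a temp directory that holds a file with a space in its name."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("plain.txt", "two words.txt"):
            with open(os.path.join(tmp, name), "w"):
                pass
        return list_names(tmp)


if __name__ == "__main__":
    print(listing_demo())
```

Each file arrives as its own list element, so a name containing a space cannot be confused with two files, and the result is already a Python object.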

Safety:

os.system and subprocess are powerful. It's good when you need this power, but it's dangerous when you don't. When you use os.listdir, you know it cannot do anything other than list files or raise an error. When you use os.system or subprocess to achieve the same behaviour, you can potentially end up doing something you did not mean to do.

Injection Safety (see shell injection examples):


If you use input from the user as a new command, you've basically given them a shell. This is much like SQL injection providing a shell in the DB for the user.

An example would be a command of the form:


# ... read some user input
os.system(user_input + " some continuation")

This can easily be exploited to run arbitrary code by supplying the input "NASTY COMMAND;#", which produces the eventual call:

os.system("NASTY COMMAND;# some continuation")

There are many such commands that can put your system at risk.

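When an external command really is needed, one way to avoid this (a sketch, not the only option) is to pass the arguments as a list so that no shell parses them. The child below is a harmless Python one-liner standing in for a real command, and echo_argument is a hypothetical helper name.

```python
import subprocess
import sys


def echo_argument(user_input):
    """Pass `user_input` to a child process as one literal argument."""
    # Stand-in child: writes back its first argument unchanged.
    child = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.argv[1])",
        user_input,
    ]
    # No shell is involved, so ';' and '#' have no special meaning here.
    result = subprocess.run(child, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    print(echo_argument("NASTY COMMAND;# some continuation"))
```

The child receives the whole string as a single argument, so the text after the semicolon is data rather than a second command. This still spawns a process, so it addresses the injection problem only, not the overhead.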

#4


22  

For a simple reason: when you call a shell function, it creates a sub-shell which is destroyed after your command exits, so if you change directory in a shell, it does not affect your environment in Python.

Besides, creating a sub-shell is time-consuming, so running OS commands through a shell will hurt your performance.

EDIT


I ran some timing tests:

In [379]: %timeit os.chmod('Documents/recipes.txt', 0755)
10000 loops, best of 3: 215 us per loop

In [380]: %timeit os.system('chmod 0755 Documents/recipes.txt')
100 loops, best of 3: 2.47 ms per loop

In [382]: %timeit call(['chmod', '0755', 'Documents/recipes.txt'])
100 loops, best of 3: 2.93 ms per loop

The internal function runs more than 10 times faster.
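The measurement can be approximated outside IPython with the timeit module. This is a rough sketch under assumptions: it uses a temporary file rather than Documents/recipes.txt, mode 644, and a hypothetical helper named compare_chmod. Absolute numbers will differ from machine to machine.

```python
import os
import shutil
import subprocess
import tempfile
import timeit


def compare_chmod(repeats=20):
    """Time os.chmod against an external chmod; returns seconds per call."""
    with tempfile.NamedTemporaryFile() as handle:
        path = handle.name
        timings = {
            "os.chmod": timeit.timeit(
                lambda: os.chmod(path, 0o644), number=repeats
            ) / repeats
        }
        chmod = shutil.which("chmod")  # None where no chmod executable is on PATH
        if chmod is not None:
            timings["subprocess"] = timeit.timeit(
                lambda: subprocess.call([chmod, "644", path]), number=repeats
            ) / repeats
        return timings


if __name__ == "__main__":
    for label, seconds in compare_chmod().items():
        print(f"{label}: {seconds * 1e6:.1f} us per call")
```

The external variant is skipped when no chmod executable is found, so the script also runs on systems without one.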

EDIT2


There may be cases where invoking an external executable yields better results than a Python package. I just remembered an email from a colleague of mine reporting that gzip called through subprocess performed much better than the Python package he used. But that is certainly not the case when we are talking about standard os functions that mirror standard OS commands.

#5


16  

Shell calls are OS-specific, whereas Python os module functions are not, in most cases. Using them also avoids spawning a subprocess.

#6


11  

It's far more efficient. The "shell" is just another OS binary which contains a lot of system calls. Why incur the overhead of creating the whole shell process just for that single system call?


The situation is even worse when you use os.system for something that's not a shell built-in. You start a shell process which in turn starts an executable which then (two processes away) makes the system call. At least subprocess would have removed the need for a shell intermediary process.


This is not specific to Python. systemd is such an improvement to Linux startup times for the same reason: it makes the necessary system calls itself instead of spawning a thousand shells.
