How to read a (static) file from inside a Python package?

Date: 2023-02-03 18:20:10

How can I read a file that is inside my Python package?

My situation

A package that I load has a number of templates (text files used as strings) that I want to load from within the program. But how do I specify the path to such a file?

Imagine I want to read a file from:

package\templates\temp_file

Some kind of path manipulation? Package base path tracking?

7 answers

#1


-6  

[Added 2016-06-15: apparently this doesn't work in all situations. Please refer to the other answers.]


import os, mypackage
template = os.path.join(mypackage.__path__[0], 'templates', 'temp_file')

#2


82  

Assuming your template is located inside your module's package at this path:

<your_package>/templates/temp_file

the correct way to read your template is to use the pkg_resources package from the setuptools distribution:

import pkg_resources

resource_package = __name__  # Could be any module/package name
resource_path = '/'.join(('templates', 'temp_file'))  # Do not use os.path.join(), see below

template = pkg_resources.resource_string(resource_package, resource_path)
# or for a file-like stream:
template = pkg_resources.resource_stream(resource_package, resource_path)

Tip: This will read data even if your distribution is zipped, so you may set zip_safe=True in your setup.py, and/or use the long-awaited zipapp packer from Python 3.5 to create self-contained distributions.
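
To illustrate the "works even when zipped" point without requiring setuptools, here is a minimal sketch that swaps in the standard library's pkgutil.get_data as a stand-in for pkg_resources.resource_string. The package name zipped_demo_pkg, the archive name bundle.zip, and the file contents are made up for the demo; only the templates/temp_file layout comes from the question.

```python
import importlib
import os
import pkgutil
import sys
import tempfile
import zipfile


def read_template_from_zip():
    """Build a tiny zipped package, then read a data file from inside it."""
    with tempfile.TemporaryDirectory() as workdir:
        archive = os.path.join(workdir, "bundle.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            # zipped_demo_pkg is a hypothetical package used only in this demo.
            zf.writestr("zipped_demo_pkg/__init__.py", "")
            zf.writestr("zipped_demo_pkg/templates/temp_file", "Hello from the zip")

        # Putting the archive on sys.path makes the package importable via zipimport.
        sys.path.insert(0, archive)
        importlib.invalidate_caches()
        try:
            # The resource name is '/'-separated and relative to the package.
            data = pkgutil.get_data("zipped_demo_pkg", "templates/temp_file")
        finally:
            # Undo the demo's changes to the interpreter state.
            sys.path.remove(archive)
            sys.modules.pop("zipped_demo_pkg", None)
        return data.decode("utf-8")


if __name__ == "__main__":
    print(read_template_from_zip())
```

Like resource_string, pkgutil.get_data returns bytes, so the sketch decodes the result before returning it.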

According to the Setuptools/pkg_resources docs, do not use os.path.join:

Basic Resource Access

Note that resource names must be /-separated paths and cannot be absolute (i.e. no leading /) or contain relative names like "..". Do not use os.path routines to manipulate resource paths, as they are not filesystem paths.
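
The rules quoted above can be captured in a small helper. This is only a sketch: resource_name is a hypothetical function written for this illustration, not part of pkg_resources, and it checks just the two restrictions the quote mentions.

```python
def resource_name(*parts):
    """Join components into a '/'-separated resource name.

    Raises ValueError for names that are absolute or contain '..',
    mirroring the restrictions quoted from the pkg_resources docs.
    """
    name = "/".join(parts)
    if name.startswith("/"):
        raise ValueError("resource names cannot be absolute: %r" % name)
    if ".." in name.split("/"):
        raise ValueError("resource names cannot contain '..': %r" % name)
    return name


if __name__ == "__main__":
    # Always '/', regardless of the platform's own path separator.
    print(resource_name("templates", "temp_file"))
```

The point of joining with '/' rather than os.path.join is that the result is the same on every platform, which is what a resource name (as opposed to a filesystem path) requires.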

#3


5  

In case you have this structure:

lidtk
├── bin
│   └── lidtk
├── lidtk
│   ├── analysis
│   │   ├── char_distribution.py
│   │   └── create_cm.py
│   ├── classifiers
│   │   ├── char_dist_metric_train_test.py
│   │   ├── char_features.py
│   │   ├── cld2
│   │   │   ├── cld2_preds.txt
│   │   │   └── cld2wili.py
│   │   ├── get_cld2.py
│   │   ├── text_cat
│   │   │   ├── __init__.py
│   │   │   ├── REAMDE.md   <---------- say you want to get this
│   │   │   └── textcat_ngram.py
│   │   └── tfidf_features.py
│   ├── data
│   │   ├── __init__.py
│   │   ├── create_ml_dataset.py
│   │   ├── download_documents.py
│   │   ├── language_utils.py
│   │   ├── pickle_to_txt.py
│   │   └── wili.py
│   ├── __init__.py
│   ├── get_predictions.py
│   ├── languages.csv
│   └── utils.py
├── README.md
├── setup.cfg
└── setup.py

you need this code:

import pkg_resources

# __name__ in case you're within the package
# - otherwise it would be 'lidtk' in this example as it is the package name
path = 'classifiers/text_cat/REAMDE.md'  # always use slash
filepath = pkg_resources.resource_filename(__name__, path)

I'm not too sure about the "always use slash" part. It might come from the setuptools documentation:

Also notice that if you use paths, you must use a forward slash (/) as the path separator, even if you are on Windows. Setuptools automatically converts slashes to appropriate platform-specific separators at build time.

In case you wonder where the documentation is:

#4


2  

Every Python module in your package has a __file__ attribute.

You can use it like this:

import os
import mypackage

templates_dir = os.path.join(os.path.dirname(mypackage.__file__), 'templates')
template_file = os.path.join(templates_dir, 'template.txt')

For egg resources see: http://peak.telecommunity.com/DevCenter/PythonEggs#accessing-package-resources
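
As a runnable sketch of the __file__ approach, the helpers below locate and read a file stored next to a package's __init__.py. The helper names and the throwaway package demo_templates_pkg are invented for this demo. It assumes the package is installed as ordinary files on disk; it will not work for a package imported from a zip or an unextracted egg.

```python
import importlib
import os
import sys
import tempfile


def package_file_path(package, *parts):
    """Return the filesystem path of a file stored inside an on-disk package."""
    package_dir = os.path.dirname(os.path.abspath(package.__file__))
    return os.path.join(package_dir, *parts)


def read_package_text(package, *parts, encoding="utf-8"):
    """Read a text file located relative to the package's __init__.py."""
    with open(package_file_path(package, *parts), encoding=encoding) as handle:
        return handle.read()


def demo():
    """Create a temporary package with one template and read it back."""
    with tempfile.TemporaryDirectory() as root:
        # demo_templates_pkg is a hypothetical package used only in this demo.
        pkg_dir = os.path.join(root, "demo_templates_pkg")
        os.makedirs(os.path.join(pkg_dir, "templates"))
        with open(os.path.join(pkg_dir, "__init__.py"), "w", encoding="utf-8"):
            pass
        template = os.path.join(pkg_dir, "templates", "template.txt")
        with open(template, "w", encoding="utf-8") as handle:
            handle.write("Hello, $name!")

        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            package = importlib.import_module("demo_templates_pkg")
            return read_package_text(package, "templates", "template.txt")
        finally:
            # Undo the demo's changes to the interpreter state.
            sys.path.remove(root)
            sys.modules.pop("demo_templates_pkg", None)


if __name__ == "__main__":
    print(demo())
```

Here os.path.join is appropriate because the result is a real filesystem path, unlike the resource names used with pkg_resources.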

#5


0  

Assuming you are using an egg file that has not been extracted:

I "solved" this in a recent project by using a post-install script that extracts my templates from the egg (zip file) to the proper directory in the filesystem. It was the quickest, most reliable solution I found, since working with __path__[0] can sometimes go wrong (I don't recall the name, but I came across at least one library that added something to the front of that list!).

Also, egg files are usually extracted on the fly to a temporary location called the "egg cache". You can change that location using an environment variable, either before starting your script or even later, e.g.:

os.environ['PYTHON_EGG_CACHE'] = path

However, pkg_resources might do the job properly.

#6


-1  

See: Finding a file in a Python module distribution

#7


-3  

You should be able to import portions of your package's namespace with something like:

from my_package import my_stuff

... you should not need to specify anything that looks like a filename if this is a properly constructed Python package (that's normally abstracted away).
