Validating a YAML document in Python

Date: 2023-01-15 07:44:26

One of the benefits of XML is being able to validate a document against an XSD. YAML doesn't have this feature, so how can I validate that the YAML document I open is in the format expected by my application?

9 solutions

#1


8  

Try Rx, it has a Python implementation. It works on JSON and YAML.

From the Rx site:

"When adding an API to your web service, you have to choose how to encode the data you send across the line. XML is one common choice for this, but it can grow arcane and cumbersome pretty quickly. Lots of webservice authors want to avoid thinking about XML, and instead choose formats that provide a few simple data types that correspond to common data structures in modern programming languages. In other words, JSON and YAML.

Unfortunately, while these formats make it easy to pass around complex data structures, they lack a system for validation. XML has XML Schemas and RELAX NG, but these are complicated and sometimes confusing standards. They're not very portable to the kind of data structure provided by JSON, and if you wanted to avoid XML as a data encoding, writing more XML to validate the first XML is probably even less appealing.

Rx is meant to provide a system for data validation that matches up with JSON-style data structures and is as easy to work with as JSON itself."

#2


21  

Given that JSON and YAML are pretty similar beasts, you could make use of JSON-Schema to validate a sizable subset of YAML. Here's a code snippet (you'll need PyYAML and jsonschema installed):

from jsonschema import validate
import yaml

schema = """
type: object
properties:
  testing:
    type: array
    items:
      enum:
        - this
        - is
        - a
        - test
"""

good_instance = """
testing: ['this', 'is', 'a', 'test']
"""

validate(yaml.safe_load(good_instance), yaml.safe_load(schema))  # passes

# Now let's try a bad instance...

bad_instance = """
testing: ['this', 'is', 'a', 'bad', 'test']
"""

validate(yaml.safe_load(bad_instance), yaml.safe_load(schema))

# Fails with:
# ValidationError: 'bad' is not one of ['this', 'is', 'a', 'test']
#
# Failed validating 'enum' in schema['properties']['testing']['items']:
#     {'enum': ['this', 'is', 'a', 'test']}
#
# On instance['testing'][3]:
#     'bad'

One limitation: if your schema spans multiple files and you use "$ref" to reference the other files, then I think those other files need to be JSON. There are probably ways around that. In my own project, I'm experimenting with specifying the schema in JSON files while the instances are YAML.
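
The snippet above stops at the first failure. As a minimal sketch of the same approach that collects every violation instead, the example below uses jsonschema's Draft7Validator and its iter_errors method. The schema, the "required" rule, and the helper name find_errors are illustrative assumptions, not part of the original answer.

```python
import yaml
from jsonschema import Draft7Validator

# Hypothetical schema for illustration, written in YAML like the one above.
SCHEMA_YAML = """
type: object
required: [testing]
properties:
  testing:
    type: array
    items:
      enum: [this, is, a, test]
"""


def find_errors(instance_text, schema_text=SCHEMA_YAML):
    """Return 'location: message' strings; the list is empty if the instance is valid."""
    schema = yaml.safe_load(schema_text)
    instance = yaml.safe_load(instance_text)
    validator = Draft7Validator(schema)
    # Sort by location so the report order is stable.
    errors = sorted(
        validator.iter_errors(instance),
        key=lambda e: [str(p) for p in e.absolute_path],
    )
    return [
        "/".join(str(p) for p in e.absolute_path) + ": " + e.message
        for e in errors
    ]


if __name__ == "__main__":
    for line in find_errors("testing: [this, bad, worse]"):
        print(line)
```

Reporting all errors at once tends to be friendlier for hand-edited configuration files, since the user can fix everything in one pass.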

#3


6  

Yes - having support for validation is vital for lots of important use cases. See e.g. YAML and the importance of Schema Validation « Stuart Gunter

As already mentioned, there is Rx, available for various languages, and Kwalify for Ruby and Java.

See also the PyYAML discussion: YAMLSchemaDiscussion.

A related effort is JSON Schema, which even had some IETF standardization activity (draft-zyp-json-schema-03 - A JSON Media Type for Describing the Structure and Meaning of JSON Documents)

#4


5  

These look good. The YAML parser can handle the syntax errors, and one of these libraries can validate the data structures.

#5


3  

I am in the same situation. I need to validate the elements of a YAML document.

At first I thought 'PyYAML tags' was the best and simplest way, but I later decided to go with 'PyKwalify', which actually defines a schema for YAML.

PyYAML tags:

YAML has tag support, so we can enforce basic type checks by prefixing a value with its data type, e.g. for an integer: !!int "123"
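
To illustrate what the tag does, here is a small sketch using PyYAML's safe_load. The key name "port" and the helper names are made up for the example. It shows that a quoted scalar normally loads as a string, that the !!int tag forces an integer, and that a tagged value which cannot be converted is rejected at load time.

```python
import yaml


def load_port(text):
    """Parse a one-key YAML mapping and return the value stored under 'port'."""
    return yaml.safe_load(text)["port"]


def is_rejected(text):
    """Return True if PyYAML refuses to load the text."""
    try:
        yaml.safe_load(text)
    except (ValueError, yaml.YAMLError):
        # A failed !!int conversion surfaces as an exception during loading.
        return True
    return False


untagged = load_port('port: "123"')      # quoted scalar stays a string
tagged = load_port('port: !!int "123"')  # explicit tag forces an integer

if __name__ == "__main__":
    print(type(untagged).__name__, type(tagged).__name__)
    print(is_rejected('port: !!int "abc"'))
```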

More on PyYAML: http://pyyaml.org/wiki/PyYAMLDocumentation#Tags

This is good, but if you are going to expose it to the end user, it might cause confusion. I did some research on defining a schema for YAML. The idea is that we can validate the YAML against its corresponding schema for basic data type checks. Custom validations, such as IP addresses or random strings, can also be added. That way the schema is kept separately, leaving the YAML simple and readable.

I am unable to post more links. Please google 'schema for YAML' to view the schema discussions.

PyKwalify:

There is a package called PyKwalify which serves this purpose: https://pypi.python.org/pypi/pykwalify

This package best fits my requirements. I tried it with a small example in my local setup, and it works. Here's the sample schema file.

#sample schema

type: map
mapping:
    Emp:
        type:    map
        mapping:
            name:
                type:      str
                required:  yes
            email:
                type:      str
            age:
                type:      int
            birth:
                type:     str

Valid YAML file for this schema

---
Emp:
    name:   "abc"
    email:  "xyz@gmail.com"
    age:    yy
    birth:  "xx/xx/xxxx"

Thanks

#6


3  

I find Cerberus to be very reliable with great documentation and straightforward to use.

Here is a basic implementation example:

my_yaml.yaml:

name: 'my_name'
date: 2017-10-01
metrics:
  percentage:
    value: 87
    trend: stable

Defining the validation schema in schema.py:

{
    'name': {
        'required': True,
        'type': 'string'
    },
    'date': {
        'required': True,
        'type': 'date'
    },
    'metrics': {
        'required': True,
        'type': 'dict',
        'schema': {
            'percentage': {
                'required': True,
                'type': 'dict',
                'schema': {
                    'value': {
                        'required': True,
                        'type': 'number',
                        'min': 0,
                        'max': 100
                    },
                    'trend': {
                        'type': 'string',
                        'nullable': True,
                        'regex': '(?i)^(down|equal|up)$'
                    }
                }
            }
        }
    }
}

Using PyYAML to load a YAML document:

import yaml

def __load_doc():
    # __yaml_path is expected to be defined elsewhere in the module
    with open(__yaml_path, 'r') as stream:
        try:
            return yaml.safe_load(stream)
        except yaml.YAMLError as exception:
            raise exception

Evaluating the yaml file is straightforward:

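import ast
from cerberus import Validator

# schema.py holds a single dict literal, so literal_eval is enough (and safer than eval)
schema = ast.literal_eval(open('PATH_TO/schema.py', 'r').read())
v = Validator(schema)
doc = __load_doc()
print(v.validate(doc, schema))
print(v.errors)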

Keep in mind that Cerberus is a format-agnostic data validation tool: it validates Python data structures, so it can be used with formats other than YAML, such as JSON or XML, once the document has been loaded into a dict.

#7


0  

I'm not aware of a Python solution, but there is a Ruby schema validator for YAML called Kwalify. If you don't come across a Python library, you should be able to call it using subprocess.

#8


0  

You can use Python's yaml library to display the message, character position, line, and file name of a syntax error in the file you load.

#!/usr/bin/env python

import yaml

with open("example.yaml", 'r') as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)

The error message can be accessed via exc.problem

Access exc.problem_mark to get a <yaml.error.Mark> object.

This object allows you to access the following attributes:

  • name
  • column
  • line

Hence you can create your own pointer to the issue:

pm = exc.problem_mark
print("Your file {} has an issue on line {} at position {}".format(pm.name, pm.line, pm.column))
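
As a self-contained sketch of the same idea, the helper below (the name describe_yaml_error is made up for illustration) parses a string and returns a one-line description of the first syntax error. PyYAML's Mark.line and Mark.column are zero-based, so the sketch adds 1 to both before reporting them.

```python
import yaml


def describe_yaml_error(text):
    """Return None if the text parses, else a one-line description of the syntax error."""
    try:
        yaml.safe_load(text)
    except yaml.YAMLError as exc:
        mark = getattr(exc, "problem_mark", None)
        if mark is None:
            # Not every YAMLError carries position information.
            return "invalid YAML: {}".format(exc)
        # Convert the zero-based mark to human-friendly one-based positions.
        return "{}: line {}, column {}: {}".format(
            mark.name, mark.line + 1, mark.column + 1, exc.problem
        )
    return None


if __name__ == "__main__":
    print(describe_yaml_error("key: value: oops\n"))
```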

#9


0  

I wrapped some existing JSON-related Python libraries so that they can be used with YAML as well.

The resulting python library mainly wraps ...

  • jsonschema - a validator for json files against json-schema files, being wrapped to support validating yaml files against json-schema files in yaml-format as well.

  • jsonpath-ng - an implementation of JSONPath for python, being wrapped to support JSONPath selection directly on yaml files.

... and is available on github:

https://github.com/yaccob/ytools

It can be installed using pip:

pip install ytools

Validation example (from https://github.com/yaccob/ytools#validation):

import ytools
ytools.validate("test/sampleschema.yaml", ["test/sampledata.yaml"])

What you don't get out of the box yet is validation against external schemas that are also in YAML format.

ytools is not providing anything that hasn't existed before - it just makes the application of some existing solutions more flexible and more convenient.
