Should I use YAML or JSON to store my Perl data?

Time: 2023-01-15 09:27:18

I've been using the YAML format with reasonable success for the last 6 months or so.

However, the pure-Perl implementation of the YAML parser makes it fairly fiddly to hand-write a readable file, and it has (in my opinion) annoying quirks such as requiring a newline at the end of the file. It's also gigantically slow compared to the rest of my program.

I'm pondering the next evolution of my project, and I'm considering using JSON instead (a mostly strict subset of YAML, as it turns out). But which format has the most community traction and effort in Perl?

Which appears today to be the better long-term format for simple data description in Perl, YAML or JSON, and why?

7 Answers

#1


79  

YAML vs JSON is something very much not settled in Perl, and I will admit I tend to be in the middle of that. I would advise that either is going to get you about as much community traction. I'd make the decision based on the various pros and cons of the formats. I break down the various data serializing options like so (I'm going to community wiki this so people can add to it):

YAML Pros

  • Human friendly, people write basic YAML without even knowing it
  • WYSIWYG strings
  • Expressive (it has the TMTOWTDI nature)
  • Expandable type/metadata system
  • Perl compatible data types
  • Portable
  • Familiar (a lot of the inline and string syntax looks like Perl code)
  • Good implementations if you have a compiler (YAML::XS)
  • Good ability to dump Perl data
  • Compact use of screen space (possible; you can format it to fit on one line)

YAML Cons

  • Large spec
  • Unreliable/incomplete pure Perl implementations
  • Whitespace as syntax can be contentious

JSON Pros

  • Human readable/writable
  • Small spec
  • Good implementations
  • Portable
  • Perlish syntax
  • YAML 1.2 is a superset of JSON
  • Compact use of screen space
  • Perl friendly data types
  • Lots of things handle JSON

JSON Cons

  • Strings are not WYSIWYG
  • No expandability
  • Some Perl structures have to be expressed ad-hoc (objects & globs)
  • Lack of expressibility

XML Pros

  • Widespread use
  • Syntax familiar to web developers
  • Large corpus of good XML modules
  • Schemas
  • Technologies to search and transform the data
  • Portable

XML Cons

  • Tedious for humans to read and write
  • Data structures foreign to Perl
  • Lack of expressibility
  • Large spec
  • Verbose

Perl/Data::Dumper Pros

  • No dependencies
  • Surprisingly compact (with the right flags)
  • Perl friendly
  • Can dump pretty much anything (via DDS, i.e. Data::Dump::Streamer)
  • Expressive
  • Compact use of screen space
  • WYSIWYG strings
  • Familiar

Perl/Data::Dumper Cons

  • Non-portable (to other languages)
  • Insecure (without heroic measures)
  • Inscrutable to non-Perl programmers
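
As a rough illustration of the "right flags" and the "insecure" points above, here is a minimal sketch using only the core Data::Dumper module. The sample data is made up for illustration; the three package variables are standard Data::Dumper settings.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical sample data, not from the original post.
my $data = { name => 'bob', nums => [1, 2, 3] };

# Terse drops the "$VAR1 =" prefix, Indent = 0 removes newlines and
# indentation, Sortkeys makes the output stable across runs.
local $Data::Dumper::Terse    = 1;
local $Data::Dumper::Indent   = 0;
local $Data::Dumper::Sortkeys = 1;

my $text = Dumper($data);
print "$text\n";

# Reading the dump back means running the text as Perl code, which is
# why this is only acceptable for data you produced yourself.
my $copy = eval $text;
die "could not parse dump: $@" if $@;
print "round trip ok\n" if $copy->{nums}[2] == 3;
```

The string eval is the whole problem: anything that can write to the dump can run code in your process, which is what "heroic measures" would have to prevent.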

Storable Pros

  • Compact? (don't have numbers to back it up)
  • Fast? (don't have numbers to back it up)

Storable Cons

  • Human hostile
  • Incompatible across Storable versions
  • Non-portable (to other languages)

#2


13  

As with most things, it depends. I think if you want speed and interoperability (with other languages), use JSON, in particular JSON::XS.

If you want something that's only ever going to be used by Perl modules, stick with YAML. It's much more common to find Perl modules on CPAN that support data description with YAML, or which depend on YAML, than with JSON.

Note that I am not an authority and this opinion is based largely on hunch and conjecture. In particular, I have not profiled JSON::XS vs. YAML::XS. If I am offensively ignorant, I can only hope I will make someone irate enough to bring useful information to the discussion by correcting me.
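
One way to replace the hunch with numbers is the core Benchmark module. This is a minimal sketch, assuming JSON::XS and YAML::XS are both installed; the sample data structure is made up for illustration, and results will depend heavily on the shape of your data and the module versions.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use JSON::XS ();
use YAML::XS ();

# Hypothetical sample data; substitute something shaped like your own.
my $data = {
    name    => 'bob',
    parents => { mother => 'susan', father => 'bill' },
    nums    => [1 .. 100],
};

my $json      = JSON::XS->new->canonical;
my $json_text = $json->encode($data);
my $yaml_text = YAML::XS::Dump($data);

# A negative count means "run each sub for at least that many CPU seconds".
cmpthese(-1, {
    json_xs_encode => sub { $json->encode($data) },
    yaml_xs_dump   => sub { YAML::XS::Dump($data) },
    json_xs_decode => sub { $json->decode($json_text) },
    yaml_xs_load   => sub { YAML::XS::Load($yaml_text) },
});
```

cmpthese prints a table of rates and relative percentages, so the comparison can be rerun whenever the data or the modules change.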

#3


7  

It's all about human readability; if that is your main concern, choose YAML:

YAML:

american:
  - Boston Red Sox
  - Detroit Tigers
  - New York Yankees
national:
  - New York Mets
  - Chicago Cubs
  - Atlanta Braves

JSON:

{
  "american": [
    "Boston Red Sox", 
    "Detroit Tigers", 
    "New York Yankees"
  ], 
  "national": [
    "New York Mets", 
    "Chicago Cubs", 
    "Atlanta Braves"
  ]
}
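
To compare the two renderings on your own data, both can be generated from the same Perl structure. A small sketch, assuming the YAML module is installed and using the core JSON::PP module for the JSON side; exact whitespace in the output depends on the module versions.

```perl
use strict;
use warnings;
use JSON::PP;
use YAML ();

my $leagues = {
    american => ['Boston Red Sox', 'Detroit Tigers', 'New York Yankees'],
    national => ['New York Mets', 'Chicago Cubs', 'Atlanta Braves'],
};

# YAML::Dump sorts hash keys by default; canonical does the same for JSON.
my $yaml_text = YAML::Dump($leagues);
my $json_text = JSON::PP->new->pretty->canonical->encode($leagues);

print $yaml_text, "\n", $json_text;
```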

#4


4  

The pure-Perl YAML implementation (the YAML module, as opposed to YAML::Syck) seems to have some serious problems. I recently ran into issues where it could not process YAML documents with very long lines (32k characters or so).

YAML is able to store and load blessed references and does so by default (the snippet below was copied from a *sepia-repl* buffer in Emacs):

I need user feedback!  Please send questions or comments to seano@cpan.org.
Sepia version 0.98.
Type ",h" for help, or ",q" to quit.
main @> use YAML
undef
main @> $foo = bless {}, 'asdf'
bless( {}, 'asdf' )
main @> $foo_dump = YAML::Dump $foo
'--- !!perl/hash:asdf {}
'
main @> YAML::Load $foo_dump
bless( {}, 'asdf' )

This is quite scary security-wise because untrusted data can be used to call any DESTROY method that has been defined in your application -- or in any of the modules it uses.

The following short program demonstrates the problem:

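#!/usr/bin/perl
use strict;
use warnings;
use YAML;
use Data::Dumper;

# Any class with a DESTROY method is a potential target.
package My::Namespace;
sub DESTROY {
    # Runs when the loaded object is freed, with contents chosen by the input.
    print Data::Dumper::Dumper \@_;
}

package main;
# The !!perl/hash tag asks the loader to bless the hash into My::Namespace.
# Newer YAML releases may require $YAML::LoadBlessed = 1 for this to bless.
my $var = YAML::Load '--- !!perl/hash:My::Namespace
bar: 2
foo: 1
';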

JSON does not allow this by default -- it is possible to serialize Perl "objects", but in order to do that, you have to define TO_JSON methods (and enable convert_blessed on the encoder).
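
The TO_JSON route can be sketched as follows. This uses the core JSON::PP module as a stand-in for whichever JSON module you prefer, and My::Point is a hypothetical class invented for the example.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical class for illustration only.
package My::Point;
sub new { my ($class, %args) = @_; return bless {%args}, $class }

# Called by the encoder when convert_blessed is on; returns a plain copy.
sub TO_JSON { my ($self) = @_; return { %$self } }

package main;

my $json = JSON::PP->new->canonical->convert_blessed;
my $text = $json->encode({ point => My::Point->new(x => 1, y => 2) });
print "$text\n";

# Decoding yields plain hashes and arrays. Nothing is blessed, so the
# input cannot cause a DESTROY method to run.
my $back = $json->decode($text);
print ref($back->{point}), "\n";   # prints "HASH"
```

The asymmetry is the point: objects can be written out on request, but reading JSON back never reconstructs them unless you do so explicitly.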

#5


1  

If you are considering JavaScript Object Notation, why not use "Perl Object Notation"?

JSON:

{"name": "bob", "parents": {"mother": "susan", "father": "bill"}, "nums": [1, 2, 3]}

Perl:

{name => "bob", parents => {mother => "susan", father => "bill"}, nums => [1, 2, 3]}
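
Loading such a file is a one-liner with do, sketched below with core modules only. The temporary file stands in for a hypothetical data file; note that do compiles and runs whatever the file contains, so this is only suitable for files you control.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small "Perl Object Notation" file (a stand-in for a real data file).
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} q({name => "bob", parents => {mother => "susan", father => "bill"}, nums => [1, 2, 3]});
close $fh or die "close failed: $!";

# do FILE compiles and runs the file and returns its last expression.
my $data = do $path;
die "could not load $path: ", ($@ || $!) unless ref $data eq 'HASH';

print "$data->{name} has mother $data->{parents}{mother}\n";
```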

#6


0  

You might also want to consider using Storable. You will likely get a very good speed boost with it. The trade-offs are:

  • the Storable format is binary and not human readable like JSON or YAML
  • Storable is not a pure Perl module (if that matters)
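
For reference, a minimal Storable round trip looks like this. Storable ships with Perl; the sample data and the temporary file are made up for the example.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempfile);

# Hypothetical sample data.
my $data = { name => 'bob', nums => [1, 2, 3] };

my ($fh, $path) = tempfile(UNLINK => 1);
close $fh or die "close failed: $!";

# nstore writes in network byte order, which is more portable across
# machines than plain store.
nstore($data, $path);

my $copy = retrieve($path);
print join(',', @{ $copy->{nums} }), "\n";   # prints "1,2,3"
```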

#7


0  

I use YAML for tracking the status of processes because I can read the YAML in the middle of the process. You (technically) need fully formed documents to read XML or JSON. YAML is nice for tracking status because you can write lots of mini docs to a file. Otherwise, I usually go with XML or JSON. Nice summary of pros & cons above, btw.
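
The "lots of mini docs" approach might look like the sketch below, assuming the YAML module is installed. The step/status fields and the temporary file are invented for illustration.

```perl
use strict;
use warnings;
use YAML qw(Dump LoadFile);
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(SUFFIX => '.yml', UNLINK => 1);

# Each Dump call emits one complete document starting with "---",
# so the file is readable after every append.
for my $step (1 .. 3) {
    print {$fh} Dump({ step => $step, status => 'ok' });
}
close $fh or die "close failed: $!";

# In list context LoadFile returns one value per document in the stream.
my @docs = LoadFile($path);
print scalar(@docs), " documents, last step $docs[-1]{step}\n";
```

A reader can run the LoadFile half at any point while the writer is still appending, which is what makes this convenient for status tracking.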
