Loading and parsing a JSON file with multiple JSON objects in Python

Time: 2022-09-15 11:49:12

I am trying to load and parse a JSON file in Python. But I'm stuck trying to load the file:

    import json
    json_data = open('file')
    data = json.load(json_data)

Yields:

ValueError: Extra data: line 2 column 1 - line 225116 column 1 (char 232 - 160128774)

I looked at 18.2. json — JSON encoder and decoder in the Python documentation, but it's pretty discouraging to read through this horrible-looking documentation.

3 solutions

#1


168  

You have a JSON Lines format text file. You need to parse your file line by line:

    import json

    data = []
    with open('file') as f:
        for line in f:
            data.append(json.loads(line))

Each line contains valid JSON, but as a whole, it is not a valid JSON value as there is no top-level list or object definition.
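
To see why the whole file is rejected while each line is fine, here is a minimal sketch. The two-record sample text is made up for illustration and is not taken from the asker's file.

```python
import json

# Hypothetical sample: two JSON objects, one per line (JSON Lines style).
sample = '{"id": 1}\n{"id": 2}\n'

# Parsing the whole text at once fails: after the first object the
# parser finds more data and raises an "Extra data" error.
try:
    json.loads(sample)
    whole_file_ok = True
except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
    whole_file_ok = False
    error_message = str(exc)

# Parsing line by line succeeds because each line is a complete JSON value.
records = [json.loads(line) for line in sample.splitlines()]
```

The error message begins with the same "Extra data" text as the traceback in the question, pointing at the start of line 2.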

Note that because the file contains JSON per line, you are saved the headaches of trying to parse it all in one go or to figure out a streaming JSON parser. You can now opt to process each line separately before moving on to the next, saving memory in the process. You probably don't want to append each result to one list and then process everything if your file is really big.
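
One way to process each line before moving on to the next is a generator. The following is only a sketch: the helper name `iter_json_lines`, the temporary demo file, and the `value` key are all assumptions made for illustration.

```python
import json
import os
import tempfile


def iter_json_lines(path):
    """Yield one parsed JSON value per non-blank line of the file at `path`."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines, e.g. a stray empty line
                yield json.loads(line)


# Hypothetical demo data written to a temporary file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".jsonl", delete=False, encoding="utf-8"
) as tmp:
    tmp.write('{"value": 1}\n{"value": 2}\n\n{"value": 3}\n')
    demo_path = tmp.name

# Process each record as it is read; only a running total is kept,
# so the full list of records never has to exist in memory.
total = 0
for record in iter_json_lines(demo_path):
    total += record["value"]

os.remove(demo_path)
```

Because the generator yields records one at a time, memory use stays roughly constant regardless of how many lines the file has.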

If you have a file containing individual JSON objects with delimiters in-between, see "How do I use the 'json' module to read in one JSON object at a time?" to parse out individual objects using a buffered method.
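
For the simpler case where the objects are separated only by whitespace and the text fits in memory, `json.JSONDecoder.raw_decode` can pull values off one at a time. This is a sketch under those assumptions, not the buffered method from the linked question; the helper name `iter_concatenated_json` and the sample text are hypothetical.

```python
import json


def iter_concatenated_json(text):
    """Yield each JSON value found back to back in `text`."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Hypothetical sample: three objects separated by newlines and spaces.
text = '{"a": 1}\n{"b": [1, 2]}  {"c": {"d": null}}'
objects = list(iter_concatenated_json(text))
```

Unlike the line-by-line loop, this also copes with objects that span several lines, since it relies on where each value ends rather than on newlines.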

#2


5  

That is ill-formatted. You have one JSON object per line, but they are not contained in a larger data structure (i.e. an array). You'll either need to reformat it so that it begins with [ and ends with ], with a comma at the end of each line except the last, or parse it line by line as separate dictionaries.
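
If a single JSON array is really what you want, it is probably safer to let the json module do the reformatting than to add brackets and commas by hand. A minimal sketch, with made-up sample content standing in for the file:

```python
import json

# Hypothetical JSON Lines content standing in for the file.
lines_text = '{"name": "a"}\n{"name": "b"}\n'

# Parse each non-blank line, then serialize the list as one JSON array.
records = [json.loads(line) for line in lines_text.splitlines() if line.strip()]
array_text = json.dumps(records)

# The result is now a single valid JSON document.
reloaded = json.loads(array_text)
```

This still parses line by line under the hood, so it only makes sense when something downstream needs the array form.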

#3


2  

For those stumbling upon this question: the Python jsonlines library (much younger than this question) elegantly handles files with one JSON document per line. See https://jsonlines.readthedocs.io/
