How to iterate over a file and put all id values into a list [Python]

Date: 2022-03-08 21:42:32
{'asin': '0756029929', 'description': 'Spanish Third-year Pin, 1 inch in diameter.  Set of 10.', 'title': 'Spanish Third-year Pin Set of 10', 'price': 11.45, 'imUrl': 'http://ecx.images-amazon.com/images/I/51AqSOl7qLL._SY300_.jpg', 'salesRank': {'Toys & Games': 918374}, 'categories': [['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Novelty', 'Clothing']]}
{'asin': '0756029104', 'description': 'Viva Espaol pin, 1 x 1 inch. Set of 10.', 'title': 'Viva Espanol Pins Set of 10', 'price': 11.45, 'imUrl': 'http://ecx.images-amazon.com/images/I/51By%2BZpF9DL._SY300_.jpg', 'salesRank': {'Home & Kitchen': 2651070}, 'categories': [['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Novelty', 'Clothing']]}
{'asin': '0839933363', 'description': 'This necklace from the popular manga and anime series, Death Note. The necklace\'s charm is black and silver with the text, "Death Note" upon it. The approx. length of the necklace is 12"Dimension & Measurement:Length: Approx. 12"', 'title': 'Death Note Anime Manga: Cross Logo necklace', 'imUrl': 'http://ecx.images-amazon.com/images/I/51f0HkHssyL._SY300_.jpg', 'salesRank': {'Toys & Games': 1350779}, 'categories': [['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Costumes & Accessories']]}
{'asin': '1304567583', 'description': 'pink bikini swimwear glow in the dark fashion', 'title': 'Pink Bikini Swimwear Glow in the Dark Fashion', 'price': 19.99, 'imUrl': 'http://ecx.images-amazon.com/images/I/41pS%2B98jhlL._SY300_.jpg', 'categories': [['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Costumes & Accessories', 'Costumes']]}
{'asin': '1304567613', 'description': 'Bikini Swimwear glow in the dark fashion blue', 'title': 'Bikini Swimwear Blue Glow in the Dark Fashion', 'price': 29.99, 'imUrl': 'http://ecx.images-amazon.com/images/I/41ZNIUvYkyL._SY300_.jpg', 'categories': [['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Costumes & Accessories', 'Costumes']]}
{'asin': '1465014578', 'title': '2013 Desk Pad Calendar', 'imUrl': 'http://ecx.images-amazon.com/images/I/51NadiHHHsL._SX342_.jpg', 'related': {'also_bought': ['B009SDBX0Q', 'B009DCUY1G'], 'bought_together': ['B009SDBX0Q', 'B009DCUY1G']}, 'salesRank': {'Clothing': 505645}, 'categories': [['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Band & Music Fan', 'Accessories']]}
{'asin': '1620574128', 'related': {'also_bought': ['B0015KET88', 'B00004WKPP', 'B000F8T8U0', 'B000F8V736', 'B000F8VAOM', 'B0015KGFQM', 'B003U6P4OS', '1564519392', 'B000F8XF8Q', 'B0042SR3E2', 'B004PBLVDU', 'B000G3LR9Y', 'B0006PKZBI', 'B0007PC9CK', 'B001G98DS0', 'B001UFWJLW', 'B003S8XLWA', '0486214834', '1609964713', 'B000P1PVMQ', '0590308572', 'B000QDZY52', '1564514188', 'B0006PKZ7W', 'B000T2YKIM', 'B000QDTWF0', 'B000FA6DXS', 'B0007P94ZA', 'B000WA3FKU', 'B00004WKPU', 'B000F8XF68', 'B004DJ51JE', '

I have this file and want to put all the asin values into their own list. Here is what I have so far, but I can't figure out how to do this or what the best approach is: the file has a .json extension but is not valid JSON, which is why I'm trying to read it like a normal text file.

clothing_ids = []
with open('File.json', "r") as f:
    for line in f:
        if 'asin' in line:
            # Code that gets the value of asin from this line
            # and then appends it to clothing_ids goes here
            pass

print(clothing_ids)

2 Solutions

#1


2  

If the file is guaranteed untainted, you could eval each line:

with open('File.json', "r") as f:
    asins = [eval(line)['asin'] for line in f]
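
For context on why eval comes up at all, here is a minimal sketch (the sample line is abridged from the question's data, and the variable names are illustrative) showing that json.loads rejects such a line because of its single quotes, while eval reads it as a Python dict literal:

```python
import json

# One record abridged from the question's sample; note the single quotes.
line = "{'asin': '0756029929', 'price': 11.45}"

try:
    json.loads(line)
    parsed_as_json = True
except json.JSONDecodeError:
    # JSON requires double-quoted strings, so this branch is taken.
    parsed_as_json = False

# eval treats the line as Python source; only do this with trusted input.
record = eval(line)
asin = record['asin']
```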

Here is the same code using @Ajax1234's ast.literal_eval() to avoid issues with tainted files, but keeping my list comprehension, which evaluates each line individually and so doesn't store the whole dataset as a temporary.

import ast

with open('File.json', "r") as f:
    asins = [ast.literal_eval(line)['asin'] for line in f]
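
If some lines may be blank or cut off (the last sample line in the question appears truncated), a more defensive variant might look like the sketch below. The helper name collect_asins and the choice to skip bad lines rather than abort are assumptions, not part of the original answer:

```python
import ast

def collect_asins(lines):
    """Return the 'asin' of every line that parses as a dict literal."""
    asins = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            item = ast.literal_eval(line)
        except (ValueError, SyntaxError):
            continue  # skip truncated lines and non-literal expressions
        if isinstance(item, dict) and 'asin' in item:
            asins.append(item['asin'])
    return asins

sample = [
    "{'asin': '0756029929', 'price': 11.45}\n",
    "\n",
    "{'asin': '1620574128', 'related': {'also_bought': ['B0015KET88', '\n",  # truncated
    "__import__('os').getcwd()\n",  # not a literal, so literal_eval refuses it
]
result = collect_asins(sample)
```

The helper accepts any iterable of lines, so an open file object can be passed in place of the sample list.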

For your comment-based bonus question, getting a list of all "also_bought" items, including duplicates:

import ast

also_bought = []
asins = []
with open('File.json', "r") as f:
    for line in f:
        item = ast.literal_eval(line)
        asins.append(item['asin'])
        if 'related' in item:
            related = item['related']
            if 'also_bought' in related:
                also_bought.extend(related['also_bought'])
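
If duplicates are not wanted after all, the collected list can be reduced afterwards. A minimal sketch, using made-up stand-in records and dict.fromkeys to keep first-seen order (dicts preserve insertion order in Python 3.7+); the .get() chain is just a compact alternative to the nested if checks:

```python
import ast

# Stand-in lines shaped like the question's data; the IDs are for illustration only.
lines = [
    "{'asin': 'A1', 'related': {'also_bought': ['B1', 'B2']}}",
    "{'asin': 'A2'}",
    "{'asin': 'A3', 'related': {'also_bought': ['B2', 'B3']}}",
]

also_bought = []
for line in lines:
    item = ast.literal_eval(line)
    # Missing 'related' or 'also_bought' keys fall back to an empty list.
    also_bought.extend(item.get('related', {}).get('also_bought', []))

# dict.fromkeys drops repeats while keeping first-seen order.
unique_also_bought = list(dict.fromkeys(also_bought))
```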

#2


2  

Use a list comprehension with ast.literal_eval:

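import ast
# One dict literal per line, so parse each line rather than the whole file at once
with open('File.json') as f:
    data = [ast.literal_eval(line) for line in f if line.strip()]
asins = [i['asin'] for i in data]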
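
Whether a whole file can go through a single ast.literal_eval call depends on its layout. The sketch below (with made-up two-record text) suggests that text holding one dict per line is not a single literal, so it has to be parsed line by line:

```python
import ast

# Two records, one dict literal per line, as in the question's sample layout.
text = "{'asin': 'A1'}\n{'asin': 'A2'}\n"

try:
    ast.literal_eval(text)
    whole_text_ok = True
except SyntaxError:
    # Two expressions on separate lines do not form one literal.
    whole_text_ok = False

# Parsing each line on its own works.
data = [ast.literal_eval(line) for line in text.splitlines()]
asins = [i['asin'] for i in data]
```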
