Google App Engine NDB database corrupted?

Time: 2022-03-15 19:16:33

I keep getting random errors like:

suspended generator _get_tasklet(context.py:329) raised ProtocolBufferDecodeError(corrupted)

or

suspended generator put(context.py:796) raised ValueError(Expecting , delimiter: line 1 column 440 (char 440))

or

suspended generator put(context.py:796) raised ValueError(Invalid \escape: line 1 column 18002 (char 18002))

or

suspended generator _get_tasklet(context.py:329) raised ProtocolBufferDecodeError(truncated)

Everything was working fine up until a couple of days ago, and I haven’t made any changes. When I restart my app, everything is fine for about five minutes until I get a

suspended generator _get_tasklet(context.py:329) raised ProtocolBufferDecodeError(corrupted)

After that point, I get one of the other errors on every database put or get. The table and code that cause the error are different every time. I have no idea where to begin, since the error is in a new place every time. These are just regular database puts and gets, like

ndbstate = NdbStateJ.get_by_id(self.screen_name)

or

ndbstate.put()

Google searches haven’t been able to point me in any particular direction. Any ideas? The error

Expecting , delimiter: line 1 column 440 (char 440)

might be because some of the field types in some of the tables are JSON. But why all of a sudden?

So maybe I'm not escaping properly somewhere, such as by using r'{...}', but if there is a bad entry in there somewhere, how do I fix it if I can't query? And why does it break the whole table for all queries? And why is it random? It's not the same query every time.
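One way to locate a bad entry without going through the model layer is to check each stored JSON string on its own. The following is a minimal sketch, assuming the raw strings have already been obtained some other way (for example from an export). `find_bad_json` and the sample data are hypothetical names for illustration and are not part of NDB.

```python
import json


def find_bad_json(raw_by_key):
    """Return {key: error message} for every value that is not valid JSON."""
    bad = {}
    for key, raw in raw_by_key.items():
        try:
            json.loads(raw)
        except ValueError as e:  # decode errors are ValueError subclasses
            bad[key] = str(e)
    return bad


# Hypothetical sample: one good value, one with an unescaped quote,
# and one with an invalid backslash escape.
samples = {
    'good': '["hello", "world"]',
    'bad_quote': '["he said "hi"", "x"]',
    'bad_escape': r'["C:\path"]',
}
problems = find_bad_json(samples)
```

The error messages collected this way are of the same kind as the "Expecting , delimiter" and "Invalid \escape" messages quoted above, which helps map each logged error back to a specific stored value.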

Here’s an example of a table

class NdbStateJ(ndb.Model):
    last_id = ndb.IntegerProperty()
    last_search_id = ndb.IntegerProperty()
    last_geo_id = ndb.IntegerProperty()
    mytweet_num = ndb.IntegerProperty()
    mentions_processed = ndb.JsonProperty()
    previous_follower_responses = ndb.JsonProperty()
    my_tweets_tweeted = ndb.JsonProperty()
    responses_already_used = ndb.JsonProperty()
    num_followed_by_cyborg = ndb.IntegerProperty(default=0)
    num_did_not_follow_back = ndb.IntegerProperty(default=0)
    language_model_vector = ndb.FloatProperty(repeated=True)
    follow_wait_counter = ndb.IntegerProperty(default=0)

Here’s an example of creating an entry in that table

ndbstate = NdbStateJ(
    id=screen_name,
    last_id=37397357946732541,
    last_geo_id=37397357946732541,
    last_search_id=0,
    mytweet_num=0,
    mentions_processed=[],
    previous_follower_responses=[],
    my_tweets_tweeted=[],
    responses_already_used=[],
    language_model_vector=[])
ndbstate.put()

1 solution

#1


It was malformed JSON in the database causing the problem. I don't know why suddenly the problem started happening everywhere; maybe something changed on the Google side, or maybe I wasn't checking sufficiently, and new users were able to enter in malformed data. Who knows.
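On the "not checking sufficiently" possibility: one option is to confirm that a value survives a JSON round trip before it is written. This is a minimal sketch, not something taken from the original code; `validated_json` is a hypothetical helper name.

```python
import json


def validated_json(value):
    """Serialize value and confirm the result parses back to the same data.

    Raises TypeError or ValueError instead of letting a bad value be stored.
    """
    encoded = json.dumps(value)  # TypeError if value is not serializable
    if json.loads(encoded) != value:
        raise ValueError('value does not survive a JSON round trip')
    return encoded


# Quotes and backslashes are escaped by json.dumps, so this passes.
encoded = validated_json(['tweet "quoted"', 'back\\slash'])
```

Building JSON through `json.dumps` rather than by string concatenation is what keeps quotes and backslashes escaped correctly; the round-trip check only guards against values that change shape when stored.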

To fix it, I took inspiration from https://*.com/users/1011633/nizz responding to App Engine return JSON from JsonProperty, https://*.com/users/1709587/mark-amery responding to How to escape special characters in building a JSON string?, and https://*.com/users/1639625/tobias-k responding to How do I automatically fix an invalid JSON string?.

I replaced ndb.JsonProperty() with ExtendedJsonProperty where the extended version looks similar to the code below.

import json
import logging
import re

from google.appengine.ext import ndb

logging.getLogger().setLevel(logging.DEBUG)


class ExtendedJsonProperty(ndb.BlobProperty):
    # Inspired by https://*.com/questions/18576556/app-engine-return-json-from-jsonproperty
    def _to_base_type(self, value):
        # Serialize the Python value to a JSON string for storage.
        logging.debug('Dumping value %r', value)
        try:
            return json.dumps(value)
        except Exception as e:
            logging.warning('trying to fix error dumping to database: ' + str(e))
            return fix_json(value, json.dumps)

    def _from_base_type(self, value):
        # originally return json.loads(value)
        logging.debug('Loading value %r', value)
        try:
            return json.loads(value)
        except Exception as e:
            logging.warning('trying to fix error loading from database: ' + str(e))
            return fix_json(value, json.loads)


def fix_json(s, json_fun):
    # Retry json_fun on s, applying one repair per failed attempt.
    # At most len(s) attempts are made so the loop always terminates.
    for _i in range(len(s)):
        try:
            result = json_fun(s)   # try to parse...
            return result
        except Exception as e:
            logging.debug('Exception for json loads: ' + str(e))
            if 'delimiter' in str(e):
                # E.g.: "Expecting , delimiter: line 34 column 54 (char 1158)"
                logging.debug('Escaping quote to fix.')
                s = escape_quote(s, e)
            elif 'escape' in str(e):
                # E.g.: "Invalid \escape: line 1 column 9 (char 9)"
                logging.debug('Removing invalid escape to fix.')
                s = remove_invalid_escape(s)
            else:
                break
    # Could not repair: fall back to an empty object.
    # With json.loads this is {}; with json.dumps it is the JSON string '"{}"'.
    return json_fun('{}')


def remove_invalid_escape(value):
    # Inspired by https://*.com/questions/19176024/how-to-escape-special-characters-in-building-a-json-string
    # Drop any backslash that does not start a valid JSON escape.
    # 'u' is included so that valid \uXXXX escapes are left alone.
    return re.sub(r'\\(?!["\\/bfnrtu])', '', value)


def escape_quote(s, e):
    # Inspired by https://*.com/questions/18514910/how-do-i-automatically-fix-an-invalid-json-string
    # "Expecting , delimiter: line 34 column 54 (char 1158)"
    # position of unexpected character after '"'
    unexp = int(re.findall(r'\(char (\d+)\)', str(e))[0])
    # position of unescaped '"' before that
    unesc = s.rfind(r'"', 0, unexp)
    s = s[:unesc] + r'\"' + s[unesc+1:]
    # position of corresponding closing '"' (+2 for inserted '\')
    closg = s.find(r'"', unesc + 2)
    if closg + 2 < len(s):
        logging.debug('closing quote at %s of %s', closg, len(s))
        s = s[:closg] + r'\"' + s[closg+1:]
    return s
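To see what the escape-stripping step does in isolation, here is a small self-contained sketch on a hypothetical stored value. It uses a lookahead that also allows `u`, so valid `\uXXXX` escapes are kept; the sample string and the `INVALID_ESCAPE` name are assumptions for illustration.

```python
import json
import re

# A backslash that is not followed by a character starting a valid JSON escape.
INVALID_ESCAPE = re.compile(r'\\(?!["\\/bfnrtu])')

# Hypothetical stored value: "\p" is not a valid JSON escape, "\u00e9" is.
raw = r'{"path": "C:\path", "name": "caf\u00e9"}'

try:
    json.loads(raw)
    parsed_before = True
except ValueError:
    parsed_before = False  # fails with an "Invalid \escape" error

repaired = INVALID_ESCAPE.sub('', raw)
data = json.loads(repaired)
```

This kind of repair is lossy: the stray backslash is dropped, not preserved, and a regex alone cannot tell an escaped backslash followed by a letter (such as `\\p`) from a stray one, so it is probably best treated as a salvage step for already damaged rows and not as part of the normal write path.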
