How to correctly parse request strings using Twisted (Python)

Time: 2022-02-27 00:16:32

I am trying to implement a simple Twisted HTTP server that responds to requests to load tiles from a database and returns them. However, I find the way it interprets request strings quite odd.

This is what I POST to the server:

curl -d "request=loadTiles&grid[0][x]=17&grid[0][y]=185&grid[1][x]=18&grid[1][y]=184" http://localhost:8080/fetch/

What I expect the request.args to be:

{'request': 'loadTiles', 'grid': [{'x': 17, 'y': 185}, {'x': 18, 'y': 184}]}

How Twisted interprets request.args:

{'grid[1][y]': ['184'], 'grid[0][y]': ['185'], 'grid[1][x]': ['18'], 'request': ['loadTiles'], 'grid[0][x]': ['17']}

Is it possible to have it automatically parse the request string and create a list for the grid parameter or do I have to do it manually?

I could JSON-encode the grid parameter and then decode it server side, but it seems like an unnecessary hack.
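
For reference, the JSON round trip mentioned above could look roughly like the sketch below. It uses only the standard library and stands in for the real client and server: the variable names are illustrative, and in an actual Twisted resource the parsed form would come from request.args (whose keys and values are bytes on Python 3) rather than from parse_qs.

```python
import json
from urllib.parse import urlencode, parse_qs

grid = [{"x": 17, "y": 185}, {"x": 18, "y": 184}]

# Client side: the whole grid travels as one JSON-encoded form field.
body = urlencode({"request": "loadTiles", "grid": json.dumps(grid)})

# Server side (stand-in for request.args): parse the form body,
# then decode the single "grid" value back into Python objects.
args = parse_qs(body)
decoded = {
    "request": args["request"][0],
    "grid": json.loads(args["grid"][0]),
}
print(decoded)
# {'request': 'loadTiles', 'grid': [{'x': 17, 'y': 185}, {'x': 18, 'y': 184}]}
```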

2 answers

#1


1  

Instead of writing a parser, how about post-processing the request.args you are getting?

from pyparsing import Suppress, alphas, alphanums, nums, Word
from itertools import groupby

# you could do this with regular expressions too, if you prefer
LBRACK,RBRACK = map(Suppress, '[]')
ident = Word('_' + alphas, '_' + alphanums)
integer = Word(nums).setParseAction(lambda t : int(t[0]))
subscriptedRef = ident + 2*(LBRACK + (ident | integer) + RBRACK)


def simplify_value(v):
    if isinstance(v,list) and len(v)==1:
        return simplify_value(v[0])
    if v == integer:
        return int(v)
    return v

def regroup_args(dd):
    ret = {}
    subscripts = []
    for k,v in dd.items():
        # this is a pyparsing short-cut to see if a string matches a pattern
        # I also used it above in simplify_value to test for integerness of a string
        if k == subscriptedRef:
            subscripts.append(tuple(subscriptedRef.parseString(k))+
                                    (simplify_value(v),))
        else:
            ret[k] = simplify_value(v)

    # sort all the matched subscripted args, and then use groupby to
    # group by name and list index
    # this assumes all indexes 0-n are present in the parsed arguments
    subscripts.sort()
    for name,nameitems in groupby(subscripts, key=lambda x:x[0]):
        ret[name] = []
        for idx,idxitems in groupby(nameitems, key=lambda x:x[1]):
            idd = {}
            for item in idxitems:
                name, i, attr, val = item
                idd[attr] = val
            ret[name].append(idd)

    return ret

request_args = {'grid[1][y]': ['184'], 'grid[0][y]': ['185'], 'grid[1][x]': ['18'], 'request': ['loadTiles'], 'grid[0][x]': ['17']}
# print() as a function works on both Python 2 and Python 3
print(regroup_args(request_args))

prints

{'grid': [{'y': 185, 'x': 17}, {'y': 184, 'x': 18}], 'request': 'loadTiles'}

Note that this also simplifies the single-element lists to just the 0'th element value, and converts the numeric strings to actual integers.
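
The code comment above notes that regular expressions would work too. A minimal sketch of that variant, using only the standard library, might look like this. The names SUBSCRIPTED, simplify and regroup_args_re are hypothetical, and the sketch assumes keys are str and follow the name[index][attr] shape from the question (on Python 3, Twisted's request.args holds bytes, so keys and values would need decoding first).

```python
import re

# Matches keys shaped like name[index][attr], e.g. "grid[0][x]".
SUBSCRIPTED = re.compile(r"([A-Za-z_]\w*)\[(\d+)\]\[([A-Za-z_]\w*)\]")


def simplify(values):
    """Unwrap a single-element list and turn decimal strings into ints."""
    value = values[0] if isinstance(values, list) and len(values) == 1 else values
    if isinstance(value, str) and value.isdecimal():
        return int(value)
    return value


def regroup_args_re(args):
    plain = {}
    grouped = {}  # name -> {index -> {attr: value}}
    for key, values in args.items():
        m = SUBSCRIPTED.fullmatch(key)
        if m:
            name, index, attr = m.group(1), int(m.group(2)), m.group(3)
            grouped.setdefault(name, {}).setdefault(index, {})[attr] = simplify(values)
        else:
            plain[key] = simplify(values)
    # Emit each group as a list ordered by index; missing indexes are skipped.
    for name, by_index in grouped.items():
        plain[name] = [by_index[i] for i in sorted(by_index)]
    return plain


request_args = {'grid[1][y]': ['184'], 'grid[0][y]': ['185'], 'grid[1][x]': ['18'],
                'request': ['loadTiles'], 'grid[0][x]': ['17']}
result = regroup_args_re(request_args)
print(result)
```

Unlike the groupby version, this one does not rely on sorting tuples, so it does not require every index from 0 to n to be present; whether skipping gaps is the right behaviour depends on the application.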

#2


4  

I don't know why you would expect your urlencoded data to be decoded according to some ad-hoc non-standard rules, or why you would consider the standard treatment "odd"; [ isn't special in query strings. What software decodes them this way?

In any event, this isn't really Twisted, but Python (and more generally speaking, the web-standard way of parsing this data). You can see the sort of data you'll get back via the cgi.parse_qs function interactively. For example:

>>> import cgi
>>> cgi.parse_qs("")
{}
>>> cgi.parse_qs("x=1")
{'x': ['1']}
>>> cgi.parse_qs("x[something]=1")
{'x[something]': ['1']}
>>> cgi.parse_qs("x=1&y=2")
{'y': ['2'], 'x': ['1']}
>>> cgi.parse_qs("x=1&y=2&x=3")
{'y': ['2'], 'x': ['1', '3']}
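
The session above is from Python 2. cgi.parse_qs was later deprecated and removed from the cgi module in Python 3.8, so on a current interpreter the same check would presumably be done with urllib.parse.parse_qs, as in this small sketch:

```python
from urllib.parse import parse_qs

# Brackets in a key are not special: the key comes back verbatim.
bracketed = parse_qs("x[something]=1")
print(bracketed)  # {'x[something]': ['1']}

# Repeated keys are collected into one list, in order of appearance.
repeated = parse_qs("x=1&y=2&x=3")
print(repeated)  # {'x': ['1', '3'], 'y': ['2']}
```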

I hope that clears things up for you.
