Extracting JSON from a string in .NET

Time: 2022-10-17 16:02:24

The input string is a mix of arbitrary text and valid JSON:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<TITLE>Title</TITLE>

<META http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<META HTTP-EQUIV="Content-language" CONTENT="en">
<META HTTP-EQUIV="keywords" CONTENT="search words">
<META HTTP-EQUIV="Expires" CONTENT="0">

<script SRC="include/datepicker.js" LANGUAGE="JavaScript" TYPE="text/javascript"></script>
<script SRC="include/jsfunctions.js" LANGUAGE="JavaScript" TYPE="text/javascript"></script>

<link REL="stylesheet" TYPE="text/css" HREF="css/datepicker.css">

<script language="javascript" type="text/javascript">
function limitText(limitField, limitCount, limitNum) {
    if (limitField.value.length > limitNum) {
        limitField.value = limitField.value.substring(0, limitNum);
    } else {
        limitCount.value = limitNum - limitField.value.length;
    }
}
</script>
{"List":[{"ID":"175114","Number":"28992"}]}

The task is to deserialize the JSON part of it into an object. The string may begin with arbitrary text, but it is guaranteed to contain valid JSON. I tried using a JSON-validation regex, but .NET had trouble parsing that pattern.
So in the end I want to get only:

{
    "List": [{
        "ID": "175114",
        "Number": "28992"
    }]
}

Clarification 1:
There is only a single JSON object in the whole messy string, but the surrounding text can contain {} as well (it is actually HTML and can contain JavaScript inside <script> tags, such as function(){...}).

2 answers

#1


2  

Use a regex to find all candidate JSON structures in the string.

Then iterate over these matches until you find one that parses without throwing an exception:

JsonConvert.DeserializeObject(match.Value);

If you know the format of the JSON structure, use JsonSchema.
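
The regex-then-probe approach in this answer could look roughly like the sketch below. It is an illustration under stated assumptions, not the answerer's code: the class and method names are made up, the pattern uses .NET balancing groups to find brace-balanced spans, and System.Text.Json's JsonDocument.Parse stands in for Json.NET as the validity check. Braces inside string literals can confuse the pattern, and a JSON object nested inside a larger non-JSON brace block would be skipped, because regex matches do not overlap.

```csharp
using System;
using System.Text.Json;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class JsonCandidateFinder
{
    // Matches a '{' ... '}' span whose braces are balanced (.NET balancing groups).
    // Assumption: braces that appear inside string values do not unbalance the span.
    private static readonly Regex BalancedBraces = new Regex(
        @"\{(?>[^{}]+|\{(?<depth>)|\}(?<-depth>))*(?(depth)(?!))\}",
        RegexOptions.Compiled);

    // Returns the first balanced-brace span that parses as JSON, or null if none does.
    public static string FindFirstJsonObject(string mixedString)
    {
        foreach (Match match in BalancedBraces.Matches(mixedString))
        {
            try
            {
                // Parse only to validate; the document is disposed immediately.
                using (JsonDocument.Parse(match.Value))
                {
                    return match.Value;
                }
            }
            catch (JsonException)
            {
                // Not JSON (for example a JavaScript function body); try the next match.
            }
        }
        return null;
    }
}
```

Calling JsonCandidateFinder.FindFirstJsonObject on the sample input should skip the body of limitText, which is brace-balanced but not JSON, and return the trailing JSON text, which can then be deserialized with whichever library you prefer.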

#2


4  

You can use this method:

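    // Requires: using Newtonsoft.Json;
    public object ExtractJsonObject(string mixedString)
    {
        // Try every '{' as a possible start, from left to right.
        for (var i = mixedString.IndexOf('{'); i > -1; i = mixedString.IndexOf('{', i + 1))
        {
            // Try every '}' after that start as a possible end, from right to left.
            // The j > i bound avoids a negative Substring length and a negative start index.
            for (var j = mixedString.LastIndexOf('}'); j > i; j = mixedString.LastIndexOf('}', j - 1))
            {
                var jsonProbe = mixedString.Substring(i, j - i + 1);
                try
                {
                    return JsonConvert.DeserializeObject(jsonProbe);
                }
                catch (JsonException)
                {
                    // Not valid JSON; try the next candidate span.
                }
            }
        }
        return null;
    }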

The key idea is to try every pair of { and } and probe whether the text between them is valid JSON. The first span that parses successfully is converted to an object and returned.
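
One pitfall with probing is that an empty {} in the surrounding script is itself valid JSON, so it can be returned before the real payload. The sketch below is a variation on the same idea that deserializes into a typed model and accepts a candidate only if a caller-supplied check passes. It assumes System.Text.Json rather than Json.NET, and the Root, Item, and JsonProbe names are hypothetical, chosen to mirror the sample JSON in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical model classes mirroring the sample JSON in the question.
public class Item
{
    public string ID { get; set; }
    public string Number { get; set; }
}

public class Root
{
    public List<Item> List { get; set; }
}

public static class JsonProbe
{
    // Probes every '{'..'}' span and returns the first one that deserializes
    // into T and passes the caller's check; returns null if none does.
    public static T ExtractFirst<T>(string mixedString, Func<T, bool> isAcceptable)
        where T : class
    {
        for (var i = mixedString.IndexOf('{'); i > -1; i = mixedString.IndexOf('{', i + 1))
        {
            for (var j = mixedString.LastIndexOf('}'); j > i; j = mixedString.LastIndexOf('}', j - 1))
            {
                try
                {
                    var candidate = JsonSerializer.Deserialize<T>(mixedString.Substring(i, j - i + 1));
                    if (candidate != null && isAcceptable(candidate))
                    {
                        return candidate;
                    }
                }
                catch (JsonException)
                {
                    // Not valid JSON for T; try the next candidate span.
                }
            }
        }
        return null;
    }
}
```

For the sample input, a call such as JsonProbe.ExtractFirst<Root>(html, r => r.List != null) rejects any empty {} candidates and should return the object whose List holds the single item. Like the original method, this is quadratic in the number of braces in the worst case, which is usually acceptable for a single page of HTML.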
