How do I use regex to split a string into a 2D array?

Time: 2022-08-27 07:30:26

I've got a problem that seems simple on the face of it but has defeated my meager regex skills. I have a string that I need to convert to an array and then process the values accordingly, which is simple enough, but the format of the string cannot be changed (it is generated elsewhere) and the logic of it has me baffled.

The string is:

[6] [2] [3] 12.00; [5] [4]

It's basically a set of ids and decimal values (in this case id 3 == 12.00). The quantity of ids could change at any moment and decimal values could be in any or all of the ids.

In an ideal world I would have the following array:

Array (
   [0] => Array (
             [id]  => 6
             [num] => 
          )
   [1] => Array (
             [id]  => 2
             [num] => 
          ) 
   [2] => Array (
             [id]  => 3
             [num] => 12.00 
          )
   Etc...

Do any of you regex wizards know how this can be accomplished with less swearing than I've been able to achieve?

So far I have been able to extract the ids using:

preg_match_all('@\[(.*?)\]@s', $string, $array);

and the decimals using:

preg_match_all('/([0-9]+[,\.]{1}[0-9]{2})/', $string, $array);

but I lose the correlation between ids and values.

5 Answers

#1


3  

Example:

<?php

$string = '[6] [2] [3] 12.00; [5] [4]';

preg_match_all('/\[(?P<id>\d+)\](?: (?P<num>[\d\.]+);)?/', $string, $matches, PREG_SET_ORDER);

var_dump($matches);

Output:

array(5) {
  [0]=>
  array(3) {
    [0]=>
    string(3) "[6]"
    ["id"]=>
    string(1) "6"
    [1]=>
    string(1) "6"
  }
  [1]=>
  array(3) {
    [0]=>
    string(3) "[2]"
    ["id"]=>
    string(1) "2"
    [1]=>
    string(1) "2"
  }
  [2]=>
  array(5) {
    [0]=>
    string(10) "[3] 12.00;"
    ["id"]=>
    string(1) "3"
    [1]=>
    string(1) "3"
    ["num"]=>
    string(5) "12.00"
    [2]=>
    string(5) "12.00"
  }
  [3]=>
  array(3) {
    [0]=>
    string(3) "[5]"
    ["id"]=>
    string(1) "5"
    [1]=>
    string(1) "5"
  }
  [4]=>
  array(3) {
    [0]=>
    string(3) "[4]"
    ["id"]=>
    string(1) "4"
    [1]=>
    string(1) "4"
  }
}
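
To get from these match sets to the id/num layout asked for in the question, one option is to copy the named groups into a new array. The following is a minimal sketch, not part of the original answer; the `$rows` variable is a name chosen here for illustration. It relies on the `num` key being absent when the optional group did not take part in the match, as the dump above shows.

```php
<?php
// Sketch: reshape the PREG_SET_ORDER matches into id/num rows.
$string = '[6] [2] [3] 12.00; [5] [4]';

preg_match_all('/\[(?P<id>\d+)\](?: (?P<num>[\d\.]+);)?/', $string, $matches, PREG_SET_ORDER);

$rows = array(); // hypothetical name for the result array
foreach ($matches as $match) {
    $rows[] = array(
        'id'  => $match['id'],
        // "num" is missing when no value followed the id, so fall back to ''.
        'num' => isset($match['num']) ? $match['num'] : '',
    );
}

print_r($rows);
```

Each row then carries both keys, so later code can read `$row['num']` without checking whether it exists.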

#2


1  

If you are happy with a list of either IDs or NUMs, then you could just combine your two working regexes into one call:

preg_match_all('@  \[(?P<id> \d+ )]   |   (?P<num> [\d,.]+)  @xs',
         $string, $array, PREG_SET_ORDER);

This will give you a list of associative arrays, with either id or num set, if you also use the PREG_SET_ORDER flag.
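
Because the ids and values come back as one flat list, the link between them still has to be rebuilt. A possible sketch, assuming a value always belongs to the id immediately before it (the `$rows` name is made up for this example):

```php
<?php
$string = '[6] [2] [3] 12.00; [5] [4]';

preg_match_all('@  \[(?P<id> \d+ )]   |   (?P<num> [\d,.]+)  @xs',
         $string, $matches, PREG_SET_ORDER);

$rows = array(); // hypothetical name for the paired result
foreach ($matches as $match) {
    if (isset($match['num'])) {
        // A value match: attach it to the most recent id, if there is one.
        if ($rows) {
            $rows[count($rows) - 1]['num'] = $match['num'];
        }
    } else {
        // An id match: start a new row with an empty value.
        $rows[] = array('id' => $match['id'], 'num' => '');
    }
}

print_r($rows);
```

The `isset` check works because the `num` key only appears in the match sets where the second alternative matched.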

#3


1  

Something like this? My php skills are rather weak so you will have to check how to access the named capturing groups id/num.

preg_match_all('/\[(?P<id>\d+)\]\s*(?P<num>[-+]?\b[0-9]+(?:\.[0-9]+)?\b)?/', $subject, $result, PREG_SET_ORDER);
foreach ($result as $match) {
    // Named groups are available by name; "num" is missing when no value follows the id.
    $id  = $match['id'];
    $num = isset($match['num']) ? $match['num'] : '';
}

How it works:

"
\[             # Match the character “[” literally
(?<id>         # Match the regular expression below and capture its match into backreference with name “id”
   \d             # Match a single digit 0..9
      +              # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
]              # Match the character “]” literally
\s             # Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
   *              # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
(?<num>        # Match the regular expression below and capture its match into backreference with name “num”
   [-+]           # Match a single character present in the list “-+”
      ?              # Between zero and one times, as many times as possible, giving back as needed (greedy)
   \b             # Assert position at a word boundary
   [0-9]          # Match a single character in the range between “0” and “9”
      +              # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
   (?:            # Match the regular expression below
      \.             # Match the character “.” literally
      [0-9]          # Match a single character in the range between “0” and “9”
         +              # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
   )?             # Between zero and one times, as many times as possible, giving back as needed (greedy)
   \b             # Assert position at a word boundary
)?             # Between zero and one times, as many times as possible, giving back as needed (greedy)
"

It also takes care of negative values.
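
As a quick check of the negative-value case, here is a small sketch. The input string is invented for this example (same format as the question, with one negative value added), and `$rows` is a hypothetical name.

```php
<?php
// Hypothetical input: the question's format plus a negative value.
$subject = '[6] [3] 12.00; [7] -4.50; [8]';

preg_match_all('/\[(?P<id>\d+)\]\s*(?P<num>[-+]?\b[0-9]+(?:\.[0-9]+)?\b)?/', $subject, $result, PREG_SET_ORDER);

$rows = array();
foreach ($result as $match) {
    $rows[] = array(
        'id'  => $match['id'],
        // The sign is captured as part of "num" when present.
        'num' => isset($match['num']) ? $match['num'] : '',
    );
}

print_r($rows);
```

Note that this pattern does not require the trailing `;`, so it would also accept a value that is not followed by a semicolon.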

#4


0  

Take a look at PHP's explode function: http://php.net/manual/en/function.explode.php

#5


0  

It's not a regex approach, but maybe it works for you (of course, it could be improved):

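$str = "[6] [2] [3] 12.00; [5] [4]";
// Strip the brackets so only ids and values remain, separated by spaces.
$str = str_replace(array('[', ']'), '', $str);

$arr = explode(' ', $str);
$array = array();
for ($i = 0; $i < count($arr); $i++) {
    // Skip stray decimal values; normally they are consumed with their id below.
    if (strpos($arr[$i], '.') !== false) {
        continue;
    }

    $ret = array('id' => $arr[$i], 'num' => '');

    // The next token is a value if it contains ';'. Guard against reading past the last token.
    $next = isset($arr[$i + 1]) ? strstr($arr[$i + 1], ';', true) : false;
    if ($next !== false) {
        $ret['num'] = $next;
        $i++; // the value token has been consumed
    }
    $array[] = $ret;
}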
