Splitting text into lines when matching a pattern in PHP [duplicate]

Date: 2022-09-13 00:22:58

Possible Duplicate:
Splitting string array based upon digits in php?

I have a set of data that's all in one big chunk of text. It looks similar to the following:

01/02 10:45:01 test data 01/03 11:52:09 test data 01/04 18:63:05 test data 01/04 21:12:09 test data 01/04 13:10:07 test data 01/05 07:08:09 test data 01/05 10:07:08 test data 01/05 08:00:09 test data 01/06 11:01:09 test data

I'm trying to simply make this readable (see below for example), but the only thing on each of the lines that's remotely similar is that the start follows a 00/00 pattern.

01/02 10:45:01 test data 
01/03 11:52:09 test data 
01/04 18:63:05 test data 
01/04 21:12:09 test data 
01/04 13:10:07 test data 
01/05 07:08:09 test data 
01/05 10:07:08 test data 
01/05 08:00:09 test data 
01/06 11:01:09 test data 

I've gotten as far as splitting it by matching it against a regex pattern:

$split = preg_split("/\d+\\/\d+ /", $contents, -1, PREG_SPLIT_NO_EMPTY);

And this outputs:

Array ( [0] => 
        [1] => 10:45:01 test data 
        [2] => 11:52:09 test data 
        [3] => 18:63:05 test data 
        [4] => 18:63:05 test data 
        ...and so on

But as you can see, the problem is that preg_split isn't keeping the delimiter. I've tried changing the preg_split call to:

$split = preg_split("/\d+\\/\d+ /", $contents, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

However, this returns the same as above, with no 00/00 at the start of each line.

Have I done something wrong, or is there a better way of achieving this?

2 Solutions

#1


4  

You can tell preg_split() to split at any point in the string which is followed by digits-slash-digits by using a lookahead assertion.

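$result = preg_split('#(?=\b\d+/\d+)#', $contents, -1, PREG_SPLIT_NO_EMPTY);

The PREG_SPLIT_NO_EMPTY flag is used because the very start of the string is also a point followed by digits-slash-digits, so an empty first element would otherwise be produced there. The \b word boundary inside the lookahead stops the pattern from also matching between the two digits of a date (just before "1/02"), which would split every date in two. We could alter the regex to not split at the very start of the string, but that would make it a little more difficult to understand at a glance, whereas the flag is very clear.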
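As a minimal sketch of the lookahead split, assuming a shortened copy of the sample data from the question: the \b inside the lookahead, the trim step, and the variable name $lines are additions for this example, not part of the question. The \b is there so the split cannot land between the two digits of a date.

```php
<?php
// Sample input copied from the question, shortened to three records.
$contents = '01/02 10:45:01 test data 01/03 11:52:09 test data 01/04 18:63:05 test data';

// Split at every position that is followed by digits-slash-digits.
// The lookahead consumes nothing, so the date stays in each piece.
// \b keeps the split from also firing between the "0" and "1" of "01/02".
$lines = preg_split('#(?=\b\d+/\d+)#', $contents, -1, PREG_SPLIT_NO_EMPTY);

// Each piece except the last keeps its trailing space, so trim them all.
$lines = array_map('trim', $lines);

// Print one record per line.
echo implode(PHP_EOL, $lines), PHP_EOL;
```

This also suggests why adding PREG_SPLIT_DELIM_CAPTURE changed nothing in the question: that flag only returns the parts of the delimiter captured by parenthesized groups, and the pattern there had no groups.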

#2


2  

PHP:

<?php

$text = '01/02 10:45:01 test data 01/03 11:52:09 test data 01/04 18:63:05 test data 01/04 21:12:09 test data 01/04 13:10:07 test data 01/05 07:08:09 test data 01/05 10:07:08 test data 01/05 08:00:09 test data 01/06 11:01:09 test data';

$text = preg_replace('/(\d{2})\/(\d{2})(.*)/U', PHP_EOL . "$0", $text);

echo $text;

Output:

01/02 10:45:01 test data 
01/03 11:52:09 test data 
01/04 18:63:05 test data 
01/04 21:12:09 test data 
01/04 13:10:07 test data 
01/05 07:08:09 test data 
01/05 10:07:08 test data 
01/05 08:00:09 test data 
01/06 11:01:09 test data
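One detail of the replacement above: with the /U modifier the trailing (.*) matches nothing, so $0 is just the date, every date gets a newline in front of it (including the first), and the space before each date stays at the end of the previous line. A possible variation is sketched below, assuming a shortened copy of the same sample text; the pattern and the $formatted variable are illustrative choices, not the answer's own code.

```php
<?php
// Sample input from the answer, shortened to three records.
$text = '01/02 10:45:01 test data 01/03 11:52:09 test data 01/04 18:63:05 test data';

// Replace any whitespace before a date with a newline, keeping the date ($1).
// Consuming the whitespace avoids trailing spaces on the previous line.
$formatted = preg_replace('/\s*(\d{2}\/\d{2})/', PHP_EOL . '$1', $text);

// The first date also received a newline in front of it; strip that one.
$formatted = ltrim($formatted);

echo $formatted, PHP_EOL;
```

The trade-off is the same as in the answer: the text stays a single string, which suits direct output, while preg_split is the better fit when the records are needed as an array.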
