在PHP中突出显示两个字符串之间的差异

时间:2022-09-11 22:30:00

What is the easiest way to highlight the difference between two strings in PHP?

要突出PHP中两个字符串之间的区别,最简单的方法是什么?

I'm thinking along the lines of the Stack Overflow edit history page, where new text is in green and removed text is in red. If there are any pre-written functions or classes available, that would be ideal.

我正在考虑堆栈溢出编辑历史页面的行,其中新文本是绿色的,删除的文本是红色的。如果有任何预先编写的函数或类可用,那将是理想的。

12 个解决方案

#1


38  

You can use the PHP Horde_Text_Diff package. It suits your needs, and is quite customisable as well.

您可以使用PHP Horde_Text_Diff包。它适合您的需要,而且是可以定制的。

It's also licensed under the GPL, so Enjoy!

它也是在GPL下授权的,所以享受吧!

#2


70  

Just wrote a class to compute smallest (not to be taken literally) number of edits to transform one string into another string:

只是写了一个类来计算最小的编辑数(不是按字面意思)将一个字符串转换成另一个字符串:

http://www.raymondhill.net/finediff/

http://www.raymondhill.net/finediff/

It has a static function to render a HTML version of the diff.

它有一个静态函数来呈现diff的HTML版本。

It's a first version, and likely to be improved, but it works just fine as of now, so I am throwing it out there in case someone needs to generate a compact diff efficiently, like I needed.

这是第一个版本,很可能会得到改进,但是到目前为止它还可以工作,所以我把它扔在那里,以防有人需要高效地生成一个紧凑的diff,就像我需要的那样。

Edit: It's on Github now: https://github.com/gorhill/PHP-FineDiff

编辑:现在在Github上:https://github.com/gorhill/PHP-FineDiff。

#3


25  

If you want a robust library, Text_Diff (a PEAR package) looks to be pretty good. It has some pretty cool features.

如果您想要一个健壮的库,Text_Diff (PEAR包)看起来相当不错。它有一些很酷的特点。

#4


23  

This is a nice one, also http://paulbutler.org/archives/a-simple-diff-algorithm-in-php/

这是一个很好的例子,http://paulbutler.org/archives/a-simple-diff-algorithm-in-php/

Solving the problem is not as simple as it seems, and the problem bothered me for about a year before I figured it out. I managed to write my algorithm in PHP, in 18 lines of code. It is not the most efficient way to do a diff, but it is probably the easiest to understand.

解决问题并不像看上去的那么简单,这个问题困扰了我大约一年,直到我发现它。我用PHP编写了算法,用了18行代码。这不是最有效的方法,但它可能是最容易理解的。

It works by finding the longest sequence of words common to both strings, and recursively finding the longest sequences of the remainders of the string until the substrings have no words in common. At this point it adds the remaining new words as an insertion and the remaining old words as a deletion.

它通过查找两个字符串共同的最长的单词序列来工作,并递归地查找字符串的剩余单词的最长序列,直到子字符串没有共同的单词为止。此时,它将剩下的新单词作为插入,其余的旧单词作为删除。

You can download the source here: PHP SimpleDiff...

您可以在这里下载源代码:PHP SimpleDiff…

#5


9  

Here is a short function you can use to diff two arrays. It implements the LCS algorithm:

这里有一个简短的函数可以用来分割两个数组。实现了LCS算法:

function computeDiff($from, $to)
{
    $diffValues = array();
    $diffMask = array();

    $dm = array();
    $n1 = count($from);
    $n2 = count($to);

    for ($j = -1; $j < $n2; $j++) $dm[-1][$j] = 0;
    for ($i = -1; $i < $n1; $i++) $dm[$i][-1] = 0;
    for ($i = 0; $i < $n1; $i++)
    {
        for ($j = 0; $j < $n2; $j++)
        {
            if ($from[$i] == $to[$j])
            {
                $ad = $dm[$i - 1][$j - 1];
                $dm[$i][$j] = $ad + 1;
            }
            else
            {
                $a1 = $dm[$i - 1][$j];
                $a2 = $dm[$i][$j - 1];
                $dm[$i][$j] = max($a1, $a2);
            }
        }
    }

    $i = $n1 - 1;
    $j = $n2 - 1;
    while (($i > -1) || ($j > -1))
    {
        if ($j > -1)
        {
            if ($dm[$i][$j - 1] == $dm[$i][$j])
            {
                $diffValues[] = $to[$j];
                $diffMask[] = 1;
                $j--;  
                continue;              
            }
        }
        if ($i > -1)
        {
            if ($dm[$i - 1][$j] == $dm[$i][$j])
            {
                $diffValues[] = $from[$i];
                $diffMask[] = -1;
                $i--;
                continue;              
            }
        }
        {
            $diffValues[] = $from[$i];
            $diffMask[] = 0;
            $i--;
            $j--;
        }
    }    

    $diffValues = array_reverse($diffValues);
    $diffMask = array_reverse($diffMask);

    return array('values' => $diffValues, 'mask' => $diffMask);
}

It generates two arrays:

它生成两个数组:

  • values array: a list of elements as they appear in the diff.
  • 值数组:元素在diff中出现时的列表。
  • mask array: contains numbers. 0: unchanged, -1: removed, 1: added.
  • 面具数组:包含数字。0:不变,-1:删除,1:添加。

If you populate an array with characters, it can be used to compute inline difference. Now just a single step to highlight the differences:

如果用字符填充数组,可以使用它来计算行内差异。现在只需一步就能突出差异:

function diffline($line1, $line2)
{
    $diff = computeDiff(str_split($line1), str_split($line2));
    $diffval = $diff['values'];
    $diffmask = $diff['mask'];

    $n = count($diffval);
    $pmc = 0;
    $result = '';
    for ($i = 0; $i < $n; $i++)
    {
        $mc = $diffmask[$i];
        if ($mc != $pmc)
        {
            switch ($pmc)
            {
                case -1: $result .= '</del>'; break;
                case 1: $result .= '</ins>'; break;
            }
            switch ($mc)
            {
                case -1: $result .= '<del>'; break;
                case 1: $result .= '<ins>'; break;
            }
        }
        $result .= $diffval[$i];

        $pmc = $mc;
    }
    switch ($pmc)
    {
        case -1: $result .= '</del>'; break;
        case 1: $result .= '</ins>'; break;
    }

    return $result;
}

Eg.:

如:

echo diffline('*', 'ServerFault')

Will output:

将输出:

S<del>tackO</del><ins>er</ins>ver<del>f</del><ins>Fau</ins>l<del>ow</del><ins>t</ins> 

StackOerverfFaulowt

年代tackOerver fFaul owt

Additional notes:

其他说明:

  • The diff matrix requires (m+1)*(n+1) elements. So you can run into out of memory errors if you try to diff long sequences. In this case diff larger chunks (eg. lines) first, then diff their contents in a second pass.
  • diff矩阵需要(m+1)*(n+1)元素。因此,如果你试图分割长序列,你可能会遇到内存错误。在这种情况下,分割更大的块。行)首先,然后在第二遍中修改内容。
  • The algorithm can be improved if you trim the matching elements from the beginning and the end, then run the algorithm on the differing middle only. A latter (more bloated) version contains these modifications too.
  • 如果从开始到结束都对匹配元素进行修剪,然后只在不同的中间运行算法,则可以对算法进行改进。后一个(更膨胀的)版本也包含这些修改。

#6


6  

There is also a PECL extension for xdiff:

xdiff还有一个PECL扩展:

In particular:

特别是:

  • xdiff_string_diff — Make unified diff of two strings
  • xdiff_string_diff -创建两个字符串的统一diff

Example from PHP Manual:

PHP手册的例子:

<?php
$old_article = file_get_contents('./old_article.txt');
$new_article = $_POST['article'];

$diff = xdiff_string_diff($old_article, $new_article, 1);
if (is_string($diff)) {
    echo "Differences between two articles:\n";
    echo $diff;
}

#7


4  

This is the best one I've found.

这是我找到的最好的。

http://code.stephenmorley.org/php/diff-implementation/

http://code.stephenmorley.org/php/diff-implementation/

在PHP中突出显示两个字符串之间的差异

#8


3  

What you are looking for is a "diff algorithm". A quick google search led me to this solution. I did not test it, but maybe it will do what you need.

你正在寻找的是一个“diff算法”。快速搜索谷歌,我找到了这个解决方案。我没有测试它,但是也许它可以满足你的需要。

#9


3  

I had terrible trouble with the both the PEAR-based and the simpler alternatives shown. So here's a solution that leverages the Unix diff command (obviously, you have to be on a Unix system or have a working Windows diff command for it to work). Choose your favourite temporary directory, and change the exceptions to return codes if you prefer.

我在基于珍珠的和显示的更简单的替代品上都遇到了严重的麻烦。因此,这里有一个利用Unix diff命令的解决方案(显然,您必须使用Unix系统或Windows diff命令才能工作)。选择您最喜欢的临时目录,并更改异常以返回代码。

/**
 * @brief Find the difference between two strings, lines assumed to be separated by "\n|
 * @param $new string The new string
 * @param $old string The old string
 * @return string Human-readable output as produced by the Unix diff command,
 * or "No changes" if the strings are the same.
 * @throws Exception
 */
public static function diff($new, $old) {
  $tempdir = '/var/somewhere/tmp'; // Your favourite temporary directory
  $oldfile = tempnam($tempdir,'OLD');
  $newfile = tempnam($tempdir,'NEW');
  if (!@file_put_contents($oldfile,$old)) {
    throw new Exception('diff failed to write temporary file: ' . 
         print_r(error_get_last(),true));
  }
  if (!@file_put_contents($newfile,$new)) {
    throw new Exception('diff failed to write temporary file: ' . 
         print_r(error_get_last(),true));
  }
  $answer = array();
  $cmd = "diff $newfile $oldfile";
  exec($cmd, $answer, $retcode);
  unlink($newfile);
  unlink($oldfile);
  if ($retcode != 1) {
    throw new Exception('diff failed with return code ' . $retcode);
  }
  if (empty($answer)) {
    return 'No changes';
  } else {
    return implode("\n", $answer);
  }
}

#10


2  

A php port of Neil Frasers diff_match_patch (Apache 2.0 licensed)

Neil Frasers diff_match_patch的php端口(Apache 2.0许可)

#11


1  

I would recommend looking at these awesome functions from PHP core:

我建议你看看PHP核心的这些很棒的函数:

similar_text — Calculate the similarity between two strings

similar_text -计算两个字符串之间的相似度

http://www.php.net/manual/en/function.similar-text.php

http://www.php.net/manual/en/function.similar-text.php

levenshtein — Calculate Levenshtein distance between two strings

levenshtein -计算两个字符串之间的levenshtein距离

http://www.php.net/manual/en/function.levenshtein.php

http://www.php.net/manual/en/function.levenshtein.php

soundex — Calculate the soundex key of a string

soundex——计算一个字符串的soundex键

http://www.php.net/manual/en/function.soundex.php

http://www.php.net/manual/en/function.soundex.php

metaphone — Calculate the metaphone key of a string

变音——计算字符串的变音键

http://www.php.net/manual/en/function.metaphone.php

http://www.php.net/manual/en/function.metaphone.php

#12


0  

I came across this PHP diff class by Chris Boulton based on Python difflib which could be a good solution:

我遇到了Chris Boulton基于Python difflib的PHP diff类,这是一个很好的解决方案:

PHP Diff Lib

PHP Diff*

#1


38  

You can use the PHP Horde_Text_Diff package. It suits your needs, and is quite customisable as well.

您可以使用PHP Horde_Text_Diff包。它适合您的需要,而且是可以定制的。

It's also licensed under the GPL, so Enjoy!

它也是在GPL下授权的,所以享受吧!

#2


70  

Just wrote a class to compute smallest (not to be taken literally) number of edits to transform one string into another string:

只是写了一个类来计算最小的编辑数(不是按字面意思)将一个字符串转换成另一个字符串:

http://www.raymondhill.net/finediff/

http://www.raymondhill.net/finediff/

It has a static function to render a HTML version of the diff.

它有一个静态函数来呈现diff的HTML版本。

It's a first version, and likely to be improved, but it works just fine as of now, so I am throwing it out there in case someone needs to generate a compact diff efficiently, like I needed.

这是第一个版本,很可能会得到改进,但是到目前为止它还可以工作,所以我把它扔在那里,以防有人需要高效地生成一个紧凑的diff,就像我需要的那样。

Edit: It's on Github now: https://github.com/gorhill/PHP-FineDiff

编辑:现在在Github上:https://github.com/gorhill/PHP-FineDiff。

#3


25  

If you want a robust library, Text_Diff (a PEAR package) looks to be pretty good. It has some pretty cool features.

如果您想要一个健壮的库,Text_Diff (PEAR包)看起来相当不错。它有一些很酷的特点。

#4


23  

This is a nice one, also http://paulbutler.org/archives/a-simple-diff-algorithm-in-php/

这是一个很好的例子,http://paulbutler.org/archives/a-simple-diff-algorithm-in-php/

Solving the problem is not as simple as it seems, and the problem bothered me for about a year before I figured it out. I managed to write my algorithm in PHP, in 18 lines of code. It is not the most efficient way to do a diff, but it is probably the easiest to understand.

解决问题并不像看上去的那么简单,这个问题困扰了我大约一年,直到我发现它。我用PHP编写了算法,用了18行代码。这不是最有效的方法,但它可能是最容易理解的。

It works by finding the longest sequence of words common to both strings, and recursively finding the longest sequences of the remainders of the string until the substrings have no words in common. At this point it adds the remaining new words as an insertion and the remaining old words as a deletion.

它通过查找两个字符串共同的最长的单词序列来工作,并递归地查找字符串的剩余单词的最长序列,直到子字符串没有共同的单词为止。此时,它将剩下的新单词作为插入,其余的旧单词作为删除。

You can download the source here: PHP SimpleDiff...

您可以在这里下载源代码:PHP SimpleDiff…

#5


9  

Here is a short function you can use to diff two arrays. It implements the LCS algorithm:

这里有一个简短的函数可以用来分割两个数组。实现了LCS算法:

function computeDiff($from, $to)
{
    $diffValues = array();
    $diffMask = array();

    $dm = array();
    $n1 = count($from);
    $n2 = count($to);

    for ($j = -1; $j < $n2; $j++) $dm[-1][$j] = 0;
    for ($i = -1; $i < $n1; $i++) $dm[$i][-1] = 0;
    for ($i = 0; $i < $n1; $i++)
    {
        for ($j = 0; $j < $n2; $j++)
        {
            if ($from[$i] == $to[$j])
            {
                $ad = $dm[$i - 1][$j - 1];
                $dm[$i][$j] = $ad + 1;
            }
            else
            {
                $a1 = $dm[$i - 1][$j];
                $a2 = $dm[$i][$j - 1];
                $dm[$i][$j] = max($a1, $a2);
            }
        }
    }

    $i = $n1 - 1;
    $j = $n2 - 1;
    while (($i > -1) || ($j > -1))
    {
        if ($j > -1)
        {
            if ($dm[$i][$j - 1] == $dm[$i][$j])
            {
                $diffValues[] = $to[$j];
                $diffMask[] = 1;
                $j--;  
                continue;              
            }
        }
        if ($i > -1)
        {
            if ($dm[$i - 1][$j] == $dm[$i][$j])
            {
                $diffValues[] = $from[$i];
                $diffMask[] = -1;
                $i--;
                continue;              
            }
        }
        {
            $diffValues[] = $from[$i];
            $diffMask[] = 0;
            $i--;
            $j--;
        }
    }    

    $diffValues = array_reverse($diffValues);
    $diffMask = array_reverse($diffMask);

    return array('values' => $diffValues, 'mask' => $diffMask);
}

It generates two arrays:

它生成两个数组:

  • values array: a list of elements as they appear in the diff.
  • 值数组:元素在diff中出现时的列表。
  • mask array: contains numbers. 0: unchanged, -1: removed, 1: added.
  • 面具数组:包含数字。0:不变,-1:删除,1:添加。

If you populate an array with characters, it can be used to compute inline difference. Now just a single step to highlight the differences:

如果用字符填充数组,可以使用它来计算行内差异。现在只需一步就能突出差异:

function diffline($line1, $line2)
{
    $diff = computeDiff(str_split($line1), str_split($line2));
    $diffval = $diff['values'];
    $diffmask = $diff['mask'];

    $n = count($diffval);
    $pmc = 0;
    $result = '';
    for ($i = 0; $i < $n; $i++)
    {
        $mc = $diffmask[$i];
        if ($mc != $pmc)
        {
            switch ($pmc)
            {
                case -1: $result .= '</del>'; break;
                case 1: $result .= '</ins>'; break;
            }
            switch ($mc)
            {
                case -1: $result .= '<del>'; break;
                case 1: $result .= '<ins>'; break;
            }
        }
        $result .= $diffval[$i];

        $pmc = $mc;
    }
    switch ($pmc)
    {
        case -1: $result .= '</del>'; break;
        case 1: $result .= '</ins>'; break;
    }

    return $result;
}

Eg.:

如:

echo diffline('*', 'ServerFault')

Will output:

将输出:

S<del>tackO</del><ins>er</ins>ver<del>f</del><ins>Fau</ins>l<del>ow</del><ins>t</ins> 

StackOerverfFaulowt

年代tackOerver fFaul owt

Additional notes:

其他说明:

  • The diff matrix requires (m+1)*(n+1) elements. So you can run into out of memory errors if you try to diff long sequences. In this case diff larger chunks (eg. lines) first, then diff their contents in a second pass.
  • diff矩阵需要(m+1)*(n+1)元素。因此,如果你试图分割长序列,你可能会遇到内存错误。在这种情况下,分割更大的块。行)首先,然后在第二遍中修改内容。
  • The algorithm can be improved if you trim the matching elements from the beginning and the end, then run the algorithm on the differing middle only. A latter (more bloated) version contains these modifications too.
  • 如果从开始到结束都对匹配元素进行修剪,然后只在不同的中间运行算法,则可以对算法进行改进。后一个(更膨胀的)版本也包含这些修改。

#6


6  

There is also a PECL extension for xdiff:

xdiff还有一个PECL扩展:

In particular:

特别是:

  • xdiff_string_diff — Make unified diff of two strings
  • xdiff_string_diff -创建两个字符串的统一diff

Example from PHP Manual:

PHP手册的例子:

<?php
$old_article = file_get_contents('./old_article.txt');
$new_article = $_POST['article'];

$diff = xdiff_string_diff($old_article, $new_article, 1);
if (is_string($diff)) {
    echo "Differences between two articles:\n";
    echo $diff;
}

#7


4  

This is the best one I've found.

这是我找到的最好的。

http://code.stephenmorley.org/php/diff-implementation/

http://code.stephenmorley.org/php/diff-implementation/

在PHP中突出显示两个字符串之间的差异

#8


3  

What you are looking for is a "diff algorithm". A quick google search led me to this solution. I did not test it, but maybe it will do what you need.

你正在寻找的是一个“diff算法”。快速搜索谷歌,我找到了这个解决方案。我没有测试它,但是也许它可以满足你的需要。

#9


3  

I had terrible trouble with the both the PEAR-based and the simpler alternatives shown. So here's a solution that leverages the Unix diff command (obviously, you have to be on a Unix system or have a working Windows diff command for it to work). Choose your favourite temporary directory, and change the exceptions to return codes if you prefer.

我在基于珍珠的和显示的更简单的替代品上都遇到了严重的麻烦。因此,这里有一个利用Unix diff命令的解决方案(显然,您必须使用Unix系统或Windows diff命令才能工作)。选择您最喜欢的临时目录,并更改异常以返回代码。

/**
 * @brief Find the difference between two strings, lines assumed to be separated by "\n|
 * @param $new string The new string
 * @param $old string The old string
 * @return string Human-readable output as produced by the Unix diff command,
 * or "No changes" if the strings are the same.
 * @throws Exception
 */
public static function diff($new, $old) {
  $tempdir = '/var/somewhere/tmp'; // Your favourite temporary directory
  $oldfile = tempnam($tempdir,'OLD');
  $newfile = tempnam($tempdir,'NEW');
  if (!@file_put_contents($oldfile,$old)) {
    throw new Exception('diff failed to write temporary file: ' . 
         print_r(error_get_last(),true));
  }
  if (!@file_put_contents($newfile,$new)) {
    throw new Exception('diff failed to write temporary file: ' . 
         print_r(error_get_last(),true));
  }
  $answer = array();
  $cmd = "diff $newfile $oldfile";
  exec($cmd, $answer, $retcode);
  unlink($newfile);
  unlink($oldfile);
  if ($retcode != 1) {
    throw new Exception('diff failed with return code ' . $retcode);
  }
  if (empty($answer)) {
    return 'No changes';
  } else {
    return implode("\n", $answer);
  }
}

#10


2  

A php port of Neil Frasers diff_match_patch (Apache 2.0 licensed)

Neil Frasers diff_match_patch的php端口(Apache 2.0许可)

#11


1  

I would recommend looking at these awesome functions from PHP core:

我建议你看看PHP核心的这些很棒的函数:

similar_text — Calculate the similarity between two strings

similar_text -计算两个字符串之间的相似度

http://www.php.net/manual/en/function.similar-text.php

http://www.php.net/manual/en/function.similar-text.php

levenshtein — Calculate Levenshtein distance between two strings

levenshtein -计算两个字符串之间的levenshtein距离

http://www.php.net/manual/en/function.levenshtein.php

http://www.php.net/manual/en/function.levenshtein.php

soundex — Calculate the soundex key of a string

soundex——计算一个字符串的soundex键

http://www.php.net/manual/en/function.soundex.php

http://www.php.net/manual/en/function.soundex.php

metaphone — Calculate the metaphone key of a string

变音——计算字符串的变音键

http://www.php.net/manual/en/function.metaphone.php

http://www.php.net/manual/en/function.metaphone.php

#12


0  

I came across this PHP diff class by Chris Boulton based on Python difflib which could be a good solution:

我遇到了Chris Boulton基于Python difflib的PHP diff类,这是一个很好的解决方案:

PHP Diff Lib

PHP Diff*