LEN function not including trailing spaces in SQL Server

Time: 2022-12-03 00:14:02

I have the following test table in SQL Server 2005:

CREATE TABLE [dbo].[TestTable]
(
 [ID] [int] NOT NULL,
 [TestField] [varchar](100) NOT NULL
) 

Populated with:

INSERT INTO TestTable (ID, TestField) VALUES (1, 'A value');   -- Len = 7
INSERT INTO TestTable (ID, TestField) VALUES (2, 'Another value      '); -- Len = 13 + 6 spaces

When I try to find the length of TestField with the SQL Server LEN() function it does not count the trailing spaces - e.g.:

-- Note: The results grid view of TestField also does not show the trailing spaces (SQL Server 2005).
SELECT 
 ID, 
 TestField, 
 LEN(TestField) AS LenOfTestField -- Does not include trailing spaces
FROM 
 TestTable

How do I include the trailing spaces in the length result?

10 answers

#1


103  

This is clearly documented by Microsoft in MSDN at http://msdn.microsoft.com/en-us/library/ms190329(SQL.90).aspx, which states LEN "returns the number of characters of the specified string expression, excluding trailing blanks". It is, however, an easy detail to miss if you're not wary.

You need to instead use the DATALENGTH function - see http://msdn.microsoft.com/en-us/library/ms173486(SQL.90).aspx - which "returns the number of bytes used to represent any expression".

Example:

SELECT 
    ID, 
    TestField, 
    LEN(TestField) As LenOfTestField,           -- Does not include trailing spaces
    DATALENGTH(TestField) As DataLengthOfTestField      -- Shows the true length of data, including trailing spaces.
FROM 
    TestTable
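
As a quick check that needs no table, the sketch below compares the two functions on the same value used to populate TestTable. The variable name @v is only illustrative. Note that DATALENGTH counts bytes, so it matches the character count here only because the value is varchar.

```sql
-- Illustrative variable; DECLARE and SET are kept separate so this also runs on SQL Server 2005.
DECLARE @v varchar(100);
SET @v = 'Another value      ';  -- 13 characters followed by 6 trailing spaces

SELECT
    LEN(@v)        AS LenOfValue,         -- 13: trailing spaces are ignored
    DATALENGTH(@v) AS DataLengthOfValue;  -- 19: one byte per character for varchar
```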

#2


69  

You can use this trick:

LEN(Str + 'x') - 1
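
A minimal sketch of the trick, assuming a hypothetical variable @Str: once a non-space character is appended, the original trailing spaces are no longer trailing, so LEN counts them, and subtracting 1 removes the appended character again.

```sql
-- @Str is a hypothetical variable used only for illustration.
DECLARE @Str varchar(100);
SET @Str = 'Another value      ';  -- 13 characters followed by 6 trailing spaces

SELECT
    LEN(@Str)           AS PlainLen,       -- 13
    LEN(@Str + 'x') - 1 AS LenWithSpaces;  -- 19
```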

#3


9  

I use this method:

LEN(REPLACE(TestField, ' ', '.'))

I prefer this over DATALENGTH because this works with different data types, and I prefer it over adding a character to the end because you don't have to worry about the edge case where your string is already at the max length.

Note: I would test the performance before using it against a very large data set; though I just tested it against 2M rows and it was no slower than LEN without the REPLACE...

#4


8  

"How do I include the trailing spaces in the length result?"

You get someone to file a SQL Server enhancement request/bug report, because nearly all the workarounds listed here for this amazingly simple issue have some deficiency or are inefficient. This still appears to be true in SQL Server 2012. The auto-trimming feature may stem from ANSI/ISO SQL-92, but there seem to be some holes (or a lack of counting them).

Please vote it up: https://connect.microsoft.com/SQLServer/feedback/details/801381

#5


8  

There are problems with the two top-voted answers. The answer recommending DATALENGTH is prone to programmer errors. The result of DATALENGTH must be divided by 2 for NVARCHAR types, but not for VARCHAR types. This requires knowledge of the type you're getting the length of, and if that type changes, you have to diligently change the places you used DATALENGTH.

There is also a problem with the most upvoted answer (which I admit was my preferred way to do it until this problem bit me). If the thing you are getting the length of is of type NVARCHAR(4000), and it actually contains a string of 4000 characters, SQL will ignore the appended character rather than implicitly cast the result to NVARCHAR(MAX). The end result is an incorrect length. The same thing will happen with VARCHAR(8000).

What I've found that works, is nearly as fast as plain old LEN, is faster than LEN(@s + 'x') - 1 for large strings, and does not assume the underlying character width, is the following:

DATALENGTH(@s) / DATALENGTH(LEFT(LEFT(@s, 1) + 'x', 1))

This gets the datalength, and then divides by the datalength of a single character from the string. The append of 'x' covers the case where the string is empty (which would give a divide by zero in that case). This works whether @s is VARCHAR or NVARCHAR. Doing the LEFT of 1 character before the append shaves some time when the string is large. The problem with this though, is that it does not work correctly with strings containing surrogate pairs.
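
To illustrate the expression above, here is a small sketch that applies it to a varchar, an nvarchar, and an empty string. The variable names @v, @n, and @e are hypothetical, and the values are plain ASCII so that each character occupies a fixed number of bytes.

```sql
-- Hypothetical variables; DECLARE and SET are kept separate so this also runs on SQL Server 2005.
DECLARE @v varchar(20), @n nvarchar(20), @e varchar(20);
SET @v = 'ab  ';   -- 2 letters + 2 trailing spaces, 1 byte per character
SET @n = N'ab  ';  -- same text, 2 bytes per character
SET @e = '';       -- empty string: the appended 'x' avoids a divide by zero

SELECT
    DATALENGTH(@v) / DATALENGTH(LEFT(LEFT(@v, 1) + 'x', 1)) AS VarcharLength,   -- 4 / 1 = 4
    DATALENGTH(@n) / DATALENGTH(LEFT(LEFT(@n, 1) + 'x', 1)) AS NvarcharLength,  -- 8 / 2 = 4
    DATALENGTH(@e) / DATALENGTH(LEFT(LEFT(@e, 1) + 'x', 1)) AS EmptyLength;     -- 0 / 1 = 0
```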

There is another way mentioned in a comment to the accepted answer, using REPLACE(@s,' ','x'). That technique gives the correct answer, but is a couple orders of magnitude slower than the other techniques when the string is large.

Given the problems introduced by surrogate pairs on any technique that uses DATALENGTH, I think the safest method that gives correct answers that I know of is the following:

LEN(CONVERT(NVARCHAR(MAX), @s) + 'x') - 1

This is faster than the REPLACE technique, and much faster with longer strings. Basically this technique is the LEN(@s + 'x') - 1 technique, but with protection for the edge case where the string has a length of 4000 (for nvarchar) or 8000 (for varchar), so that the correct answer is given even for that. It also should handle strings with surrogate pairs correctly.
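
The edge case can be reproduced with a sketch like the following, which assumes a hypothetical NVARCHAR(4000) variable filled to its maximum length. Without the CONVERT, the concatenation result is still limited to 4000 characters, so the appended character is dropped.

```sql
-- @s is a hypothetical variable filled to the maximum length of its type.
DECLARE @s nvarchar(4000);
SET @s = REPLICATE(N'a', 4000);

SELECT
    LEN(@s + N'x') - 1                         AS PlainTrick,   -- 3999: the appended 'x' is truncated away
    LEN(CONVERT(nvarchar(max), @s) + N'x') - 1 AS WithConvert;  -- 4000
```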

#6


5  

You also need to ensure that your data is actually saved with the trailing blanks. When ANSI_PADDING is OFF (non-default):

Trailing blanks in character values inserted into a varchar column are trimmed.

#7


4  

LEN cuts trailing spaces by default, so I found this works, since REVERSE moves them to the front:

LEN(REVERSE(TestField))

So if you wanted to, you could say

SELECT
t.TestField,
LEN(REVERSE(t.TestField)) AS [Reverse],
LEN(t.TestField) AS [Count]
FROM TestTable t
WHERE LEN(REVERSE(t.TestField)) <> LEN(t.TestField)

Don't use this for leading spaces of course.

#8


1  

You should define a CLR function that returns the String's Length property if you dislike string concatenation. I use LEN('x' + @string + 'x') - 2 in my production use cases.

#9


0  

If you dislike DATALENGTH because of n/varchar concerns, how about:

select DATALENGTH(@var)/isnull(nullif(DATALENGTH(left(@var,1)),0),1)

which is just

select DATALENGTH(@var)/DATALENGTH(left(@var,1))

wrapped with divide-by-zero protection.

By dividing by the DATALENGTH of a single char, we get the length normalised.

(Of course, still issues with surrogate-pairs if that's a concern.)

#10


-2  

Use SELECT DATALENGTH('string ')
