How to extract multiple strings from a single row in SQL Server

时间:2022-04-16 21:22:21

I have e.g. the following table data:

id    |    text
--------------------------------------------------------------------------------
1     |  Peter (Peter@peter.de) and Marta (marty@gmail.com) are doing fine.
2     |  Nothing special here
3     |  Another email address (me@my.com)

Now I need a select that returns all email addresses from my text column (it's okay to just check for the parentheses) and that returns more than one row if there are multiple addresses in the text column. I know how to extract the first element, but I am totally clueless about how to find the second and any further results.

3 answers

#1


5  

You can use a recursive CTE to strip out the strings one at a time.

declare @T table (id int, [text] nvarchar(max))

insert into @T values (1, 'Peter (Peter@peter.de) and Marta (marty@gmail.com) are doing fine.')
insert into @T values (2, 'Nothing special here')
insert into @T values (3, 'Another email address (me@my.com)')

;with cte([text], email)
as
(
    -- anchor member: first pair of parentheses in each row
    select
        -- remainder of the text after the first ')'
        right([text], len([text]) - charindex(')', [text], 0)),
        -- content between the first '(' and the first ')'
        substring([text], charindex('(', [text], 0) + 1, charindex(')', [text], 0) - charindex('(', [text], 0) - 1)
    from @T
    where charindex('(', [text], 0) > 0
    union all
    -- recursive member: same extraction, applied to the remainder
    select
        right([text], len([text]) - charindex(')', [text], 0)),
        substring([text], charindex('(', [text], 0) + 1, charindex(')', [text], 0) - charindex('(', [text], 0) - 1)
    from cte
    where charindex('(', [text], 0) > 0
)
select email
from cte

Result

email
Peter@peter.de
me@my.com
marty@gmail.com
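
The query above returns only the addresses, in no guaranteed order, and it assumes every "(" has a matching ")" after it. As a minimal sketch of one possible variant (not part of the original answer), the version below also carries the id and a running counter, and only matches a ")" that comes after the "(". The column names rest and n are made up for illustration.

```sql
declare @T table (id int, [text] nvarchar(max));

insert into @T values (1, N'Peter (Peter@peter.de) and Marta (marty@gmail.com) are doing fine.');
insert into @T values (2, N'Nothing special here');
insert into @T values (3, N'Another email address (me@my.com)');

with cte(id, rest, email, n)
as
(
    -- anchor member: first "(...)" pair of each row
    select
        id,
        -- everything after the first ')' that follows the first '('
        substring([text], charindex(')', [text], charindex('(', [text])) + 1, len([text])),
        -- content between that '(' and ')'
        substring([text], charindex('(', [text]) + 1,
                  charindex(')', [text], charindex('(', [text])) - charindex('(', [text]) - 1),
        1
    from @T
    where charindex('(', [text]) > 0
      and charindex(')', [text], charindex('(', [text])) > 0
    union all
    -- recursive member: same extraction on the remaining text
    select
        id,
        substring(rest, charindex(')', rest, charindex('(', rest)) + 1, len(rest)),
        substring(rest, charindex('(', rest) + 1,
                  charindex(')', rest, charindex('(', rest)) - charindex('(', rest) - 1),
        n + 1
    from cte
    where charindex('(', rest) > 0
      and charindex(')', rest, charindex('(', rest)) > 0
)
select id, email
from cte
order by id, n;
```

With the sample data this should return Peter@peter.de and marty@gmail.com for id 1 and me@my.com for id 3. Like the original, it relies on the WHERE clause to keep rows without parentheses away from SUBSTRING, and a text with more than 100 pairs would need an OPTION (MAXRECURSION ...) hint.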

#2


2  

This assumes there are no rogue parentheses. If your text can contain any XML entity characters, you would also need to add some additional REPLACE calls.

WITH basedata(id, [text])
     AS (SELECT 1, 'Peter (Peter@peter.de) and Marta (marty@gmail.com) are doing fine.'
         UNION ALL
         SELECT 2, 'Nothing special here'
         UNION ALL
         SELECT 3, 'Another email address (me@my.com)'),
     cte(id, t, x)
     AS (SELECT *,
                CAST('<foo>' + REPLACE(REPLACE([text],'(','<bar>'),')','</bar>') + '</foo>' AS XML)
         FROM   basedata)
SELECT id,
       a.value('.', 'nvarchar(max)') as address
FROM   cte
       CROSS APPLY x.nodes('//foo/bar') as addresses(a) 
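
The additional replaces mentioned above could look roughly like the following. This is only a sketch: the extra CTE named escaped and the altered sample rows (with "&", "<" and ">") are made up for illustration. The ampersand has to be escaped first, otherwise the "&" introduced by the other replacements would be escaped a second time.

```sql
WITH basedata(id, [text])
     AS (SELECT 1, N'Peter (Peter@peter.de) & Marta (marty@gmail.com) are doing fine.'
         UNION ALL
         SELECT 2, N'Nothing <special> here'
         UNION ALL
         SELECT 3, N'Another email address (me@my.com)'),
     -- escape the XML entity characters before any markup is added
     escaped(id, t)
     AS (SELECT id,
                REPLACE(REPLACE(REPLACE([text], '&', '&amp;'), '<', '&lt;'), '>', '&gt;')
         FROM   basedata),
     -- turn "(...)" into <bar>...</bar> and wrap the row in a root element
     cte(id, x)
     AS (SELECT id,
                CAST('<foo>' + REPLACE(REPLACE(t, '(', '<bar>'), ')', '</bar>') + '</foo>' AS XML)
         FROM   escaped)
SELECT id,
       a.value('.', 'nvarchar(max)') AS address
FROM   cte
       CROSS APPLY x.nodes('/foo/bar') AS addresses(a);
```

The value() method returns the unescaped text, so the addresses come back unchanged. Characters that are not allowed in XML at all (for example most control characters) would still make the CAST fail, and unbalanced parentheses still break the approach.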

#3


-1  

The substring functions have a starting position parameter. So you find the first occurrence, and start the next search (in your loop) at the occurrence position + occurrenceLength. You'd need to write a function that returns the values either as a delimited string or as a table. Use the @ sign to find your way into the email address, and then scan backwards and forwards until you reach whitespace or a character that's invalid in an email address (or the start position, the beginning, or the last character of the string).
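
A rough sketch of that loop, written as a plain batch over a single string rather than as a function, might look like this. The variable names, the @found table and the set of characters treated as valid in an address are all assumptions made for illustration. The real rules for email addresses are more permissive, and how the [A-Z] ranges match depends on the collation.

```sql
declare @text nvarchar(max) = N'Peter (Peter@peter.de) and Marta (marty@gmail.com) are doing fine.';
declare @found table (pos int, email nvarchar(max));
declare @at int = charindex('@', @text);
declare @start int, @end int;

while @at > 0
begin
    -- scan backwards from the @ while the previous character looks valid
    set @start = @at;
    while @start > 1 and substring(@text, @start - 1, 1) like '[-A-Za-z0-9._+]'
        set @start = @start - 1;

    -- scan forwards from the @ while the next character looks valid
    set @end = @at;
    while @end < len(@text) and substring(@text, @end + 1, 1) like '[-A-Za-z0-9._]'
        set @end = @end + 1;

    insert into @found values (@start, substring(@text, @start, @end - @start + 1));

    -- continue the search after the address just found
    set @at = charindex('@', @text, @end + 1);
end

select email
from @found
order by pos;
```

For the sample string this should yield Peter@peter.de and marty@gmail.com. An address that ends a sentence would keep the trailing full stop, so some extra trimming would be needed there. To use this against a table, the loop could be wrapped in a multi-statement table-valued function and called with CROSS APPLY.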
