Pattern-based string extraction in SQL Server

Time: 2022-09-13 13:22:18

I have string data in the following format:

MODELNUMBER=S15229&PRICODE=WY554&GADTYPE=PLA&ID=S-15229
/DTYPE=PLA&ID=S-10758&UN_JTT_REDIRECT=UN_JTT_IOSV

and need to extract the IDs based on two conditions:

  1. Start right after the pattern &ID=
  2. End at the last character of the string, or stop at the first & that follows, whichever comes first.

So in the above example I'm using the following code:

SUBSTRING(MyCol,(PATINDEX('%&id=%',[MyCol])+4),(LEN(MyCol) - PATINDEX('%&id%',[MyCol])))

Essentially, this looks for the pattern &id= and extracts the string after it up to the end of the line. Could anyone advise on how to handle the latter part of the logic, i.e. stopping at the next &?
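
To make the problem easy to reproduce, here is a minimal sketch that loads the two sample strings into a table variable and runs the expression from the question against them. The table variable name @t is an assumption for illustration; MyCol is the column name used in the question. It also assumes a case-insensitive collation, since the pattern is written as &id= while the data contains &ID=.

```sql
-- Hypothetical sample table; @t is an assumed name, MyCol follows the question.
DECLARE @t TABLE (MyCol varchar(2000));

INSERT INTO @t (MyCol) VALUES
    ('MODELNUMBER=S15229&PRICODE=WY554&GADTYPE=PLA&ID=S-15229'),
    ('/DTYPE=PLA&ID=S-10758&UN_JTT_REDIRECT=UN_JTT_IOSV');

-- The expression from the question. It starts right after '&ID=', but the
-- length argument always reaches the end of the string, so anything that
-- follows the ID (such as '&UN_JTT_REDIRECT=...') is kept.
SELECT SUBSTRING(MyCol,
                 PATINDEX('%&id=%', MyCol) + 4,
                 LEN(MyCol) - PATINDEX('%&id%', MyCol)) AS ID
FROM @t;
```

The start position is already correct; only the length passed to SUBSTRING needs to change.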

My current results are

S-15229
S-10758&UN_JTT_REDIRECT=UN_JTT_IOSV

What I need is

S-15229
S-10758

2 solutions

#1


3  

Try this

SUBSTRING(MyCol, (PATINDEX('%[A-Z]-[0-9][0-9][0-9][0-9][0-9]%',[MyCol])),7) 

If you run into performance issues, add the WHERE clause below:

-- from Mytable
WHERE [MyCol] like '%[A-Z]-[0-9][0-9][0-9][0-9][0-9]%'

Maybe not the most elegant solution, but it works for me. It does assume that every ID consists of one letter, a hyphen, and five digits.

Correct syntax of PATINDEX
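
As a quick check of this approach, the following sketch applies the expression and the WHERE clause to the two sample strings from the question. The table variable @t is an assumed name used only for illustration.

```sql
-- Hypothetical sample table; @t is an assumed name, MyCol follows the question.
DECLARE @t TABLE (MyCol varchar(2000));

INSERT INTO @t (MyCol) VALUES
    ('MODELNUMBER=S15229&PRICODE=WY554&GADTYPE=PLA&ID=S-15229'),
    ('/DTYPE=PLA&ID=S-10758&UN_JTT_REDIRECT=UN_JTT_IOSV');

-- Locate the first "letter, hyphen, five digits" run and take its 7 characters.
-- The WHERE clause skips rows with no such run, where PATINDEX would return 0.
SELECT SUBSTRING(MyCol,
                 PATINDEX('%[A-Z]-[0-9][0-9][0-9][0-9][0-9]%', MyCol),
                 7) AS ID
FROM @t
WHERE MyCol LIKE '%[A-Z]-[0-9][0-9][0-9][0-9][0-9]%';
```

For this sample data the query should return S-15229 and S-10758. Note that it matches the first such run anywhere in the string, not specifically the value after &ID=, so it fits only when the ID format is fixed and does not occur elsewhere in the string.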

#2


2  

Here's one example of how to do it:

select
    substring(d.data, s.s, isnull(nullif(e.e,0),2000)-s.s) as ID, 
    d.data 
from data d
cross apply (
    select charindex('&ID=', d.data)+4 as s
) s
cross apply (
    select charindex('&', d.data, s) as e
) e
where s.s > 4

This assumes the data column is varchar(2000); the WHERE clause leaves out any rows that don't contain &ID=.

The first CROSS APPLY finds the start position, the second one the end. The ISNULL + NULLIF in the SELECT handles the case where no & is found after the ID: it replaces the 0 returned by CHARINDEX with 2000 so that the rest of the string is returned.
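
A possible variation on the same idea, shown here only as a sketch, avoids the ISNULL + NULLIF step by appending a sentinel & to the string being searched, so that CHARINDEX always finds a terminator. This is a different technique from the query above, not part of it. The table variable @data and its sample rows are assumptions for illustration.

```sql
-- Hypothetical sample table; @data is an assumed name, the column name "data"
-- follows the query above.
DECLARE @data TABLE (data varchar(2000));

INSERT INTO @data (data) VALUES
    ('MODELNUMBER=S15229&PRICODE=WY554&GADTYPE=PLA&ID=S-15229'),
    ('/DTYPE=PLA&ID=S-10758&UN_JTT_REDIRECT=UN_JTT_IOSV');

SELECT
    -- Searching d.data + '&' guarantees a match: either the real '&' after
    -- the ID, or the appended one at position LEN + 1 when the ID ends the string.
    SUBSTRING(d.data, s.s, CHARINDEX('&', d.data + '&', s.s) - s.s) AS ID,
    d.data
FROM @data AS d
CROSS APPLY (
    -- Position of the first character after '&ID='; equals 4 when '&ID=' is absent.
    SELECT CHARINDEX('&ID=', d.data) + 4 AS s
) AS s
WHERE s.s > 4;  -- leave out rows that do not contain '&ID='
```

With the two sample rows this should return S-15229 and S-10758, and it does not depend on the declared length of the column.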
