How to find a substring within a string under specific conditions in Excel

Date: 2022-09-07 00:26:46

I have records like this:

A                          Result
Hello AP#12/22 Welcome     AP#12
Thanks AP#123-21           AP#123
No problem AP#111          AP#111

As you can see, I need the AP code from each string. It must not include the part that starts with - or /.

Note:

The AP code can have any number of digits.

It can appear at the end or the start of the string.

The AP code can be followed by /, -, or any other special symbol, such as :.

So I need a generalized formula to get the AP code, rather than one that checks for each special character (/, -, :).

I want to achieve this without using VB.

2 solutions

#1


3  

Probably not the most efficient solution... but here's a way without VBA: (line break added for readability)

= "AP#"&MID(MID(A1,FIND("AP#",A1)+3,999),1,
  MAX((ISNUMBER(MID(MID(A1,FIND("AP#",A1)+3,999),{1,2,3},1)+0)+0)*{1,2,3}))

EDIT

Slightly better solution:

= MID(A1,FIND("AP#",A1),
  MAX(ISNUMBER(MID(MID(A1,FIND("AP#",A1)+3,999),{1,2,3},1)+0)*{1,2,3})+3)

EDIT (again)

As pointed out in a comment, this does not handle something like AP#1-1. Here is the updated formula that takes this into account:

= MID(A1,FIND("AP#",A1),IFERROR(MATCH(FALSE,
  ISNUMBER(MID(MID(A1,FIND("AP#",A1)+3,3),{1,2,3},1)+0),0),4)+2)
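
To see the difference between the two counting strategies, here is a rough Python sketch (not Excel; the helper names are made up for illustration). It assumes that only the characters 0-9 coerce to numbers under +0. The MAX version reports the position of the last digit among the three slots, while the MATCH version stops at the first non-digit:

```python
DIGITS = "0123456789"

def digit_flags(tail):
    """Mirror ISNUMBER(MID(tail,{1,2,3},1)+0): one True/False per character slot."""
    # MID past the end of the text yields "", which is not a number;
    # padding with spaces models that (a space is not a digit either).
    padded = tail[:3].ljust(3)
    return [ch in DIGITS for ch in padded]

def count_with_max(flags):
    """Mirror MAX(ISNUMBER(...)*{1,2,3}): position of the LAST digit slot."""
    return max(int(flag) * pos for pos, flag in enumerate(flags, start=1))

def count_with_match(flags):
    """Mirror IFERROR(MATCH(FALSE,ISNUMBER(...),0),4)-1: digits before the FIRST non-digit."""
    first_non_digit = flags.index(False) + 1 if False in flags else 4
    return first_non_digit - 1

flags = digit_flags("1-1")          # the three characters after AP# in "AP#1-1"
print(flags)                        # [True, False, True]
print(count_with_max(flags))        # 3
print(count_with_match(flags))      # 1
```

For the tail "1-1", the MAX count of 3 makes the formula return AP#1-1, whereas the MATCH count of 1 makes it return AP#1.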

As requested, here is how this formula works. I'll break it down step by step. This is a pretty long explanation but if you just take it one step at a time, I think you should be able to understand the entire formula. I'm going to explain what is going on from the inside out.

FIND("AP#",A1) returns the position in A1 (counting from 1) where the first instance of AP# begins.

For simplicity, I will refer to FIND("AP#",A1) as <x1> in the next step.

MID(A1,<x1>+3,3) returns the 3 characters in A1 that appear immediately after AP#. It only returns 3 characters because, from the original problem, you said that up to 3 digits can appear after AP#.

(Quick note: Originally I had this part of the formula as MID(A1,<x1>+3,999), but while writing this explanation I realized that 999 could be reduced to 3. 999 would still work; 3 is just simpler and makes the formula more efficient.)

I will refer to this value MID(A1,<x1>+3,3) as <x2> in the next step.

MID(<x2>,{1,2,3},1) essentially converts <x2>, which is a string of 3 characters, to an array of 3 strings, each 1 character long. In other words, if <x2> is (for example) "1-2", then MID(<x2>,{1,2,3},1) is {"1","-","2"}. It is necessary to convert a string of 3 characters to a 1x3 array of single characters in order to analyze each character individually.

I will refer to MID(<x2>,{1,2,3},1) as <x3> in the next step.

<x3>+0 may seem like a simple step but there is a lot going on here. Keep in mind <x3> is still an array of strings, not numbers (even if they look like numbers). The +0 will convert all strings that look like numbers to numbers, and will convert all strings that don't look like numbers to an error value. (In this case, #VALUE!.)

Sticking with our same example, {"1","-","2"}+0 will equal {1,#VALUE!,2}.
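
The split-and-coerce steps can be imitated in Python as a minimal sketch. The function name and the VALUE_ERROR stand-in are hypothetical, and the sketch assumes that only the single characters 0-9 coerce to numbers:

```python
VALUE_ERROR = "#VALUE!"  # stand-in for Excel's error value, not a real Excel object

def split_and_coerce(x2):
    """Mirror MID(x2,{1,2,3},1)+0 for a string of up to 3 characters."""
    # MID(x2, i, 1) for i = 1, 2, 3; slicing past the end gives "" just as MID does
    chars = [x2[i:i + 1] for i in range(3)]
    # "+0": digit strings become numbers, anything else (including "") becomes an error
    return [int(ch) if ch != "" and ch in "0123456789" else VALUE_ERROR
            for ch in chars]

print(split_and_coerce("1-2"))  # [1, '#VALUE!', 2]
```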

I will refer to <x3>+0 as <x4> in the next step.

MATCH(FALSE,ISNUMBER(<x4>),0) returns the index of the first element of <x4> that is not a number. The idea here is to find the index of the first non-number, and then include everything up to that index (minus one).

Sticking with our same example, MATCH(FALSE,ISNUMBER({1,#VALUE!,2}),0) would return 2, because the 2nd index in {1,#VALUE!,2} is the first index that is not a number.

I will refer to MATCH(FALSE,ISNUMBER(<x4>),0) as <x5> in the next step.

It is possible that all values in <x4> are numbers, in which case <x5> would return an error because it can't find a match for a non-number. IFERROR(<x5>,4) fixes this issue. It returns the value 4 if all values in <x4> are numbers. We return 4 because in that case all 3 of the characters following AP# are digits, so the first index that we aren't considering after AP# is the 4th index.

I will refer to IFERROR(<x5>,4) as <x6> in the next step.

<x6>+2 may seem like a strange calculation, and it is, so I will write it a different way that will make more sense: (<x6>-1)+3

Remember what <x6> represents here: it is the index of the first non-number that appears in the string of 3 characters after AP# (or 4 if all three are numbers). Therefore, <x6>-1 is the number of characters to include after AP#.

Now, why add 3? (<x6>-1)+3 is necessary to include the 3 characters of AP# itself. This will make sense in the next step.

I will refer to <x6>+2 as <x7> in the next step.

MID(A1,FIND("AP#",A1),<x7>) returns a portion of string A1, starting at the A in AP# and spanning <x7> characters. And how large is <x7>? It is however many digits are in the AP# code, plus 3. (Again, we must add 3 to include the 3 characters of AP# themselves in the calculation.)

This is the entire calculation.

Come to think of it, you may want to wrap an IFERROR around the entire calculation to take care of cases where AP# isn't found in the string, e.g. something like:

= IFERROR(MID(A1,FIND("AP#",A1),IFERROR(MATCH(FALSE,
  ISNUMBER(MID(MID(A1,FIND("AP#",A1)+3,3),{1,2,3},1)+0),0),4)+2),"no match")

But really, that is your call. I'm not sure if this is necessary.
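
For checking the whole chain against the sample rows outside Excel, here is a rough Python equivalent of the final formula, including the "no match" fallback. The function name and its parameters (marker, max_digits, default) are assumptions made for this sketch, not part of the formula. As above, it assumes only 0-9 count as numbers:

```python
def extract_ap_code(text, marker="AP#", max_digits=3, default="no match"):
    """Rough mirror of the IFERROR(MID(A1,FIND(...),...),"no match") formula."""
    start = text.find(marker)  # like FIND, case-sensitive; 0-based, -1 when absent
    if start == -1:
        return default         # the outer IFERROR branch
    after = start + len(marker)
    tail = text[after:after + max_digits]  # MID(A1,<x1>+3,3)
    count = 0
    for ch in tail:            # count digits before the first non-digit
        if ch not in "0123456789":
            break
        count += 1
    return text[start:after + count]       # MID(A1,<x1>,count+3)

for row in ["Hello AP#12/22 Welcome", "Thanks AP#123-21", "No problem AP#111"]:
    print(extract_ap_code(row))
```

Like the formula, this sketch looks at no more than 3 characters after AP#, so a longer code would be cut off at 3 digits unless max_digits is raised.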

#2


2  

Consider the following User Defined Function:

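Public Function FindAPcode(s As String) As String
    ' Returns "AP#" plus the run of digits that follows it, or "" if AP# is absent.
    Dim L As Long, CH As String, i As Long, j As Long

    FindAPcode = ""
    L = Len(s)
    If L = 0 Then Exit Function
    j = InStr(1, s, "AP#") + 3      ' index of the first character after AP#
    If j = 3 Then Exit Function     ' InStr returned 0: AP# not found
    FindAPcode = "AP#"

    ' Append characters until the first one that is not numeric
    For i = j To L
        CH = Mid(s, i, 1)
        If IsNumeric(CH) Then
            FindAPcode = FindAPcode & CH
        Else
            Exit Function
        End If
    Next i
End Function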
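
For comparison, the same scan can be sketched in Python to try it on the sample rows without opening the VBE. This is an illustrative port, not the UDF itself; find_ap_code is a hypothetical name, and the sketch assumes IsNumeric accepts only the single characters 0-9. Unlike the formula in #1, the loop has no 3-digit cap:

```python
def find_ap_code(s):
    """Illustrative port of the FindAPcode loop: AP# plus all digits that follow."""
    j = s.find("AP#")
    if j == -1:
        return ""              # AP# not found (also covers the empty string)
    result = "AP#"
    for ch in s[j + 3:]:       # scan from the first character after AP#
        if ch in "0123456789":
            result += ch
        else:
            break              # stop at the first non-digit
    return result

print(find_ap_code("Thanks AP#123-21"))  # AP#123
```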

User Defined Functions (UDFs) are very easy to install and use:

  1. ALT-F11 brings up the VBE window
  2. ALT-I ALT-M opens a fresh module
  3. paste the stuff in and close the VBE window

If you save the workbook, the UDF will be saved with it. If you are using a version of Excel later than 2003, you must save the file as .xlsm rather than .xlsx

To remove the UDF:

  1. bring up the VBE window as above
  2. clear the code out
  3. close the VBE window

To use the UDF from Excel:

=FindAPcode(A1)

To learn more about macros in general, see:

http://www.mvps.org/dmcritchie/excel/getstarted.htm

and

http://msdn.microsoft.com/en-us/library/ee814735(v=office.14).aspx

and for specifics on UDFs, see:

http://www.cpearson.com/excel/WritingFunctionsInVBA.aspx

Macros must be enabled for this to work!
