VBA: search a column for a specific character and extract the string up to that character

Date: 2021-12-23 19:04:24

In a specific column, I want to search each cell for a specific character, say "(" or "/". Once this character is found in a cell, I want to extract the part of the string from the beginning up to the point where the character is found, and write it to the adjacent cell.

For example, a few values in the column could look like this:

Samsung (india)
Samsung/Dhamal
Blackberry (chikna)
Blackberry/Kala Anda
iPhone - egypt
iPhone 5 * yeda

The output should look like this:

Samsung
Samsung
Blackberry
Blackberry
iPhone
iPhone 5

NOTE: The cell values in that column are not static, follow no pattern, may contain other special characters as well, and are not of a fixed length.

4 solutions

#1


4  

This question is well suited to regular expressions. The following function returns the zero-based position of the first match of a simple regex pattern in a given string, which equals the number of characters preceding the match. If no match is found, the function returns the length of the string. The function can be combined with the LEFT function to extract the text preceding the match. (LEFT is necessary because, for the sake of simplicity, this function does not implement submatches.)

The following formula would extract the product names in your sample data:

  =LEFT(A1,regexmatch(A1," \(|\/| -| \*"))

Breaking down the match pattern " \(|\/| -| \*":

  " \("  matches a space followed by a left parenthesis 
         [the backslash escapes the "(", a special character in regular expressions] 

  "|"    signifies an alternative pattern to match

  "\/"   matches a forward slash (/)

  " -"   matches a space followed by a dash (-)

  " \*"  matches a space followed by an asterisk (*).

To learn more about regular expressions, see this regular expression tutorial, one of many available on the web.

In order for the function to work, you will need to set a reference to Microsoft VBScript Regular Expressions 5.5. To do this, select Tools/References from the VBA IDE and check this item, which will be well down the long list of references.

  Function regexMatch(text As String, rePattern As String) As Long
      'Response to SO post 16591260
      'Adapted from code at http://www.macrostash.com/2011/10/08/
      '    simple-regular-expression-tutorial-for-excel-vba/.
      'Returns the zero-based index of the first match, i.e. the number of
      '    characters before it, or Len(text) if there is no match.

      Dim regEx As New VBScript_RegExp_55.RegExp
      Dim matches As VBScript_RegExp_55.MatchCollection

      regEx.Pattern = rePattern
      regEx.IgnoreCase = True 'True to ignore case
      regEx.Global = False    'Return just the first match

      If regEx.Test(text) Then
         Set matches = regEx.Execute(text)
         regexMatch = matches(0).FirstIndex
      Else
         regexMatch = Len(text)
      End If

  End Function
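
If setting the reference is inconvenient (for example, when the workbook is shared with other users), the same logic can be sketched with late binding. This is only a sketch under stated assumptions: regexMatchLate is a hypothetical name that does not appear in the original answer, and it relies on the VBScript.RegExp COM class being available on the machine.

```vba
'Late-bound sketch: no reference to Microsoft VBScript Regular Expressions 5.5 needed.
'regexMatchLate is a hypothetical name, not part of the original answer.
Function regexMatchLate(text As String, rePattern As String) As Long
    Dim regEx As Object
    Dim matches As Object

    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Pattern = rePattern
    regEx.IgnoreCase = True
    regEx.Global = False                'first match only

    Set matches = regEx.Execute(text)
    If matches.Count > 0 Then
        'FirstIndex is zero-based, so it equals the number of characters before the match
        regexMatchLate = matches(0).FirstIndex
    Else
        regexMatchLate = Len(text)      'no match: keep the whole string
    End If
End Function
```

It would be used in a cell the same way as the early-bound version: =LEFT(A1,regexMatchLate(A1," \(|\/| -| \*")). The trade-off is that late binding gives up IntelliSense and compile-time checking of the RegExp members.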

The following subroutine applies the string extraction to each cell in a specified data column and writes the new string to a specified result column. Although it would be possible to just call the function for each cell in the data column, this would incur the overhead of compiling the regular expression (which is the same for all cells) each time the function was called. To avoid this overhead, the subroutine splits the match function into two parts: the pattern definition sits outside the loop through the data cells, and the pattern execution sits inside the loop.

  Sub SubRegexMatch()
      'Response to SO post 16591260
      'Extracts from string content of each data cell in a specified source
      '   column of the active worksheet the characters to the left of the first
      '   match of a regular expression, and writes the new string to corresponding
      '   rows in a specified result column.
      'Set the regular expression, source column, result column, and first
      '   data row in the "parameters" section
      'Regex match code was adapted from http://www.macrostash.com/2011/10/08/
      '   simple-regular-expression-tutorial-for-excel-vba/

      Dim regEx As New VBScript_RegExp_55.RegExp, _
          matches As Variant, _
          regexMatch As Long     'position of character *just before* match
      Dim srcCol As String, _
          resCol As String
      Dim srcRng As Range, _
          resRng As Range
      Dim firstRow As Long, _
          lastRow As Long
      Dim srcArr As Variant, _
          resArr() As String
      Dim i As Long

      'parameters
      regEx.Pattern = " \(|\/| -| \*"    'regular expression to be matched
      regEx.IgnoreCase = True
      regEx.Global = False               'return only the first match found
      srcCol = "A"                       'source data column
      resCol = "B"                       'result column
      firstRow = 2                       'set to first row with data

      With ActiveSheet
          lastRow = .Cells(.Rows.Count, srcCol).End(xlUp).Row
          Set srcRng = .Range(srcCol & firstRow & ":" & srcCol & lastRow)
          Set resRng = .Range(resCol & firstRow & ":" & resCol & lastRow)
          srcArr = srcRng
          ReDim resArr(1 To lastRow - firstRow + 1)
          For i = 1 To srcRng.Rows.Count
              If regEx.Test(srcArr(i, 1)) Then
                  Set matches = regEx.Execute(srcArr(i, 1))
                  regexMatch = matches(0).FirstIndex
              Else
                  regexMatch = Len(srcArr(i, 1)) 'return length of original string if no match
              End If
              resArr(i) = Left(srcArr(i, 1), regexMatch)
          Next i
          resRng = WorksheetFunction.Transpose(resArr) 'assign result to worksheet
      End With
  End Sub

#2


2  

Something like this will work:

=IF(ISNUMBER(FIND("(",A1)),LEFT(A1,FIND("(",A1)-1),IF(ISNUMBER(FIND("/",A1)),LEFT(A1,FIND("/",A1)-1),""))

If you have more than just these two characters, nest in some more IF statements. There is a limit to how much of this you can do before hitting the nesting limit for cell formulas.

#3


1  

You could use the Split() function. Here is an example:

Dim text as String
Dim splt as Variant

text = "Samsung/Dhamal"
splt = Split(text, "/")
MsgBox splt(0)

Just do the same for any other character you want to split on. More on this on MSDN: http://msdn.microsoft.com/fr-fr/library/6x627e5f%28v=vs.80%29.aspx

The other (better?) alternative I see would be to use InStr() with Left(). InStr() returns the position of the first match it finds. Then you just have to crop your string. Here is an example:

Dim text as String
Dim position as Integer

text = "Samsung/Dhamal"
position = InStr(text, "/")

If position > 0 Then MsgBox Left(text, position - 1)

http://msdn.microsoft.com/fr-fr/library/8460tsh1%28v=vs.80%29.aspx
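
The InStr()/Left() idea can be extended to several candidate characters by keeping the earliest position found. The following is a minimal sketch, not part of the original answer: LeftOfFirstDelimiter and its parameter names are hypothetical, and it assumes that a case-sensitive, literal match on each delimiter is what you want.

```vba
'Hypothetical helper: returns the part of text before the earliest of several
'delimiters, or the whole text if none of them occurs.
Function LeftOfFirstDelimiter(text As String, ParamArray delimiters() As Variant) As String
    Dim cut As Long
    Dim pos As Long
    Dim i As Long

    cut = Len(text) + 1                 'default: keep everything
    For i = LBound(delimiters) To UBound(delimiters)
        pos = InStr(1, text, CStr(delimiters(i)), vbBinaryCompare)
        If pos > 0 And pos < cut Then cut = pos
    Next i

    'Drop the delimiter itself and any spaces left in front of it
    LeftOfFirstDelimiter = RTrim$(Left$(text, cut - 1))
End Function
```

From a worksheet it could be called as =LeftOfFirstDelimiter(A1,"(","/","-","*"). Listing "-" as a delimiter would also cut product names that contain a hyphen, so the delimiter list has to be chosen to fit the data.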

#4


1  

This should work for you:

Public Function IsAlphaNumeric(sChr As String) As Boolean
    IsAlphaNumeric = sChr Like "[0-9A-Za-z]"
End Function

Sub LeftUntilNonAlphaNumericChar()
    Dim cell As Range
    Dim colRng As Range
    Dim Counter As Long
    Dim s As String
    Dim ch As String
    Set colRng = ActiveSheet.Range("A1:A1000") 'specify range

    For Each cell In colRng
        s = cell.Text
        'Scan left to right and stop at the first character that is
        'neither alphanumeric nor a space
        For Counter = 1 To Len(s)
            ch = Mid$(s, Counter, 1)
            If Not IsAlphaNumeric(ch) And ch <> " " Then
                cell.Offset(0, 1).Value = Left$(s, Counter - 1)
                Exit For
            End If
        Next Counter
    Next cell
End Sub

It doesn't remove trailing whitespace at the end, but a simple addition to the sub could change that if you wanted. Good luck.
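
One way to sketch that addition is to pass the extracted prefix through RTrim$ before writing it to the sheet. TrimmedPrefix below is a hypothetical helper introduced only for illustration, and it assumes cutPos is at least 1.

```vba
'Hypothetical helper illustrating the trailing-whitespace fix.
'cutPos is the 1-based position of the first unwanted character (assumed >= 1).
Function TrimmedPrefix(s As String, cutPos As Long) As String
    TrimmedPrefix = RTrim$(Left$(s, cutPos - 1))
End Function
```

For "iPhone 5 * yeda" the asterisk is at position 10, so TrimmedPrefix("iPhone 5 * yeda", 10) gives "iPhone 5" without the trailing space. Inside the sub, the same effect comes from wrapping the Left(...) expression in RTrim$ on the line that assigns to cell.Offset(0, 1).Value.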

ADDITION: You can get the row of the last cell with data in a column and use that in your range (see below):

Public Function IsAlphaNumeric(sChr As String) As Boolean
    IsAlphaNumeric = sChr Like "[0-9A-Za-z]"
End Function

Sub LeftUntilNonAlphaNumericChar()
    Dim cell As Range
    Dim colRng As Range
    Dim Counter As Long
    Dim LastRow As Long
    Dim s As String
    Dim ch As String

    'Last used row in column A. Rows.Count adapts to the sheet's row limit,
    'so no Excel version check is needed.
    With ActiveSheet
        LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
        Set colRng = .Range("A1:A" & LastRow) 'specify range
    End With

    For Each cell In colRng
        s = cell.Text
        'Scan left to right and stop at the first character that is
        'neither alphanumeric nor a space
        For Counter = 1 To Len(s)
            ch = Mid$(s, Counter, 1)
            If Not IsAlphaNumeric(ch) And ch <> " " Then
                cell.Offset(0, 1).Value = Left$(s, Counter - 1)
                Exit For
            End If
        Next Counter
    Next cell
End Sub
