Using a regex to match URLs that contain one word and also another word

Time: 2022-06-05 04:05:03

I am trying to write a regular expression to be used in a Google Analytics goal that will match URLs containing

?package=whatever

and also

/success

The user will first visit a page like

www.website.com/become-client/?package=greatpackage

and if they purchase, they will be led to this page

www.website.com/become-client/?package=greatpackage/success

So based on this I could use the following regex

\?package\=greatpackage/success

This should match the correct destination and I would be able to use this in the goal settings in Analytics to create a goal for purchases of the greatpackage package.

But sometimes the website will use other parameters in addition to ?package, such as ?type, ?media and so on.

?type=business

Resulting in URLs like this

www.website.com/become-client/?package=greatpackage?type=business

and if they purchase, they will be led to this page

www.website.com/become-client/?package=greatpackage?type=business/success

Now the /success part is separated from the ?package part. My question is: how do I write a regex that will still match this URL no matter what other parameters appear between the two parts?

---update----

@jonarz proposed the following and it works like a charm.

\?package\=greatpackage(.*?)/success

But what if there are two products with nearly the same name, for example greatpackage and greatpackageULTRA? The regex above will match both. If changing the product names is impossible, how can I match only one of them?
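
To make the problem concrete, here is a minimal sketch that runs the lazy pattern from the update against a few sample URLs. Python's `re` module is used only for illustration (Google Analytics has its own regex engine, which may behave differently), and the greatpackageULTRA URL is a hypothetical example built from the product name mentioned above.

```python
import re

# Pattern proposed in the update, copied as-is.
lazy = re.compile(r"\?package\=greatpackage(.*?)/success")

urls = [
    "www.website.com/become-client/?package=greatpackage/success",
    "www.website.com/become-client/?package=greatpackage?type=business/success",
    # Hypothetical URL for the similarly named product.
    "www.website.com/become-client/?package=greatpackageULTRA/success",
]

# Every URL matches, including the greatpackageULTRA one,
# because (.*?) is free to consume the trailing "ULTRA".
results = {url: bool(lazy.search(url)) for url in urls}
for url, matched in results.items():
    print(matched, url)
```

All three lines print `True`, which is exactly the unwanted behaviour: the goal would count purchases of both products.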

2 solutions

#1


2  

The regex that solves the problem introduced in the edit would be:

\?package\=greatpackage((\?|\/)(.*?))?\/success(\/|\b)

Here is a test: https://regex101.com/r/jS4cH5/1 and it seems to suit your needs.
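
For readers who prefer a local check over the linked tester, the following sketch exercises the same pattern with Python's `re` module. The Python harness and the greatpackageULTRA URLs are assumptions added for illustration; only the pattern itself comes from this answer, and the regex engine used by Google Analytics may differ in details.

```python
import re

# Pattern from this answer, copied as-is.
pattern = re.compile(r"\?package\=greatpackage((\?|\/)(.*?))?\/success(\/|\b)")

should_match = [
    "www.website.com/become-client/?package=greatpackage/success",
    "www.website.com/become-client/?package=greatpackage?type=business/success",
]
should_not_match = [
    # Not purchased yet: no /success part.
    "www.website.com/become-client/?package=greatpackage",
    # Hypothetical URLs for the similarly named product.
    "www.website.com/become-client/?package=greatpackageULTRA/success",
    "www.website.com/become-client/?package=greatpackageULTRA?type=business/success",
]

for url in should_match:
    print("match   ", bool(pattern.search(url)), url)
for url in should_not_match:
    print("no match", pattern.search(url) is None, url)
```

The name is pinned down because the text right after `greatpackage` must be either `/success` or a `?` or `/` that starts the extra parameters, so `greatpackageULTRA` cannot slip through.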

#2


0  

If you want to match a URL like this one:

www.website.com/become-client/?package=greatpackage?type=business?other=nada/success

With a group to extract your package type:

.*\?package=([^\/?]+).*\/success

Without a group (just matching the URL if it contains package=greatpackage and /success):

.*\?package=greatpackage.*\/success

Without a group, matching any package type:

.*\?package=[^\/?]+.*\/success

You just need to add .* to match any run of characters (except newlines). The [^\/?]+ part is there to make sure your package type isn't empty (i.e. its first character is neither a / nor a ?).
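
As a rough illustration of the grouped variant, the sketch below uses Python's `re` module to pull the package name out of a success URL. The helper name `purchased_package` and the greatpackageULTRA URL are hypothetical, added only for this example; Google Analytics itself only takes the pattern, so this is just a way to see what the group captures.

```python
import re

# Grouped pattern from this answer, copied as-is.
with_group = re.compile(r".*\?package=([^\/?]+).*\/success")

def purchased_package(url):
    """Return the captured package name for a success URL, else None."""
    m = with_group.match(url)
    return m.group(1) if m else None

url = "www.website.com/become-client/?package=greatpackage?type=business?other=nada/success"
print(purchased_package(url))  # greatpackage

# Hypothetical URL for the similarly named product: the group captures the
# full name, so the two products can be told apart by comparing the capture.
ultra = "www.website.com/become-client/?package=greatpackageULTRA/success"
print(purchased_package(ultra))  # greatpackageULTRA

# No /success part, so there is no match at all.
print(purchased_package("www.website.com/become-client/?package=greatpackage"))  # None
```

Because `[^\/?]+` stops at the first `/` or `?`, the capture is the whole package name and nothing after it.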
