How do I stop my Rails app from being hit by bots?

Date: 2022-10-31 22:04:55

I'm not even sure I'm using the right terminology, or whether this is actually a bot. I didn't want to use the word 'spam' because it's not as though I have comments or posts being created/spammed. It looks more like something is making the same repeated request to my domain, which is what made me think it was some kind of bot.

I've opened up my first Rails app to the 'public', which is really a small group of users, <50 currently. That was last Friday. I started having performance issues today, so I looked at the log and I see tons of these RoutingErrors:

ActionController::RoutingError (No route matches "/portalApp/APF/pages/business/util/whichServer.jsp" with {:method=>:get}):

They are filling up the log and I'm assuming this is causing the slowdown. Note the .jsp on the end; this is a Rails app, so I've got no URLs remotely like this in my app. I mean, I don't even have a /portalApp path, so I don't know where this is coming from.

This is hosted at Dreamhost, and I chatted with one of their support people, who suggested a couple of sites that detail using .htaccess to block things. But it looks like you need to know the IP or domain that the requests are coming from, which I don't.

How can I block this? How can I find the IP or domain from the request? Any other suggestions?

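On the "how can I find the IP" part: the client address is the first field of each line in the web server's access log (not the Rails log). A small script can tally which IPs are requesting the bogus paths. A sketch, assuming Apache's combined log format (`offending_ips` is a made-up helper name; adjust the regex if Dreamhost's log format differs):

```ruby
# Extract the client IP and requested path from Apache combined-log
# lines, and count hits per IP for paths matching a given pattern.
LOG_LINE = /\A(?<ip>\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST|HEAD) (?<path>\S+)/

def offending_ips(lines, pattern)
  counts = Hash.new(0)
  lines.each do |line|
    m = LOG_LINE.match(line)
    next unless m && pattern.match?(m[:path])
    counts[m[:ip]] += 1
  end
  counts
end
```

Running this over the access log with a pattern like `/\.jsp\z/` should show whether the requests come from one address or many, which determines whether a simple IP block will help.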


Follow up info:

After looking at the access logs, it looks like it's not a bot. Maybe I'm not reading the logs right, but there are valid URL requests (generated from within my Flex app) coming from the same IP. So now I'm wondering if it's some kind of plugin generating the requests, but I really don't know. Now I'm also wondering if it's possible to block a certain URL request based on a pattern, but I suppose that's a separate question.

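Blocking by URL pattern is possible before the request ever reaches the Rails router, for instance with a tiny Rack middleware. A sketch (`BlockJspRequests` is a made-up name; you would register it with `config.middleware.use BlockJspRequests` in config/application.rb):

```ruby
# Minimal Rack middleware that returns 404 for any request whose
# path ends in .jsp, so it never hits the Rails router or log noise.
class BlockJspRequests
  def initialize(app)
    @app = app
  end

  def call(env)
    if env["PATH_INFO"].to_s.end_with?(".jsp")
      # Short-circuit: answer directly without calling the Rails stack.
      [404, { "Content-Type" => "text/plain" }, ["Not Found"]]
    else
      @app.call(env)
    end
  end
end
```

This keeps the RoutingError noise out of the Rails log entirely, since blocked requests never reach ActionController.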

3 Solutions

#1


2  

Old question, but for people who are still looking for alternatives, I suggest checking out Kickstarter's rack-attack gem. It allows not only blacklisting and whitelisting, but also throttling.

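For this particular case, a rack-attack initializer might look something like the following sketch (assumes the rack-attack gem is in the Gemfile; the rule name and the throttle limits are arbitrary examples, tune them to your traffic):

```ruby
# config/initializers/rack_attack.rb

# Drop any request for a .jsp path outright; a Rails app
# has no business serving these.
Rack::Attack.blocklist("block jsp probes") do |req|
  req.path.end_with?(".jsp")
end

# Throttle each client IP to 300 requests per 300 seconds.
Rack::Attack.throttle("requests by ip", limit: 300, period: 300) do |req|
  req.ip
end
```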

#2


0  

This page seems to offer some good advice: Here

The section on blocking by user agent may be something you could look at implementing. Is there any way you can get the user agent of the bot from your logs? If so, look for the unique part of the user agent that identifies the bot, and add the following to .htaccess, replacing the relevant bits:

# Set the env var "bad_bot" when the User-Agent matches
# (replace SpammerRobot with the string that identifies your bot)
BrowserMatchNoCase SpammerRobot bad_bot
Order Deny,Allow
Deny from env=bad_bot

It's explained in more detail at that link, and of course, if you can't get the user agent from your logs then this will be of no use to you!

#3


0  

You can also update your public/robots.txt file to allow/disallow robots.


http://www.robotstxt.org/wc/robots.html
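For example, a public/robots.txt that asks all crawlers to stay away entirely would contain the following. Keep in mind, though, that robots.txt is purely advisory: well-behaved crawlers honor it, but a client probing for .jsp vulnerabilities will almost certainly ignore it.

```
User-agent: *
Disallow: /
```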
