How does Google Analytics avoid spoofing?

So I'm stuck trying to figure out how Google Analytics avoids spoofing. Sure, when you sign up for an account, they make you verify that you own the domain by uploading a file. But you are also given some script tags with a unique public code (replaced with 'XXXXXXX' below). What's stopping somebody from copying that code, spoofing the request headers, and pretending to be my site by following Google's authentication strategy with curl?

<script type="text/javascript">

  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'XXXXXXX']);
  _gaq.push(['_trackPageview']);

  (function() {
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
  })();

</script>

I ask because I'm trying to create a similar JavaScript plugin that exposes my site's data to participating websites ("clients"). I'm not sure how I can get this functionality without a private key on the client's server side. That kind of sucks, because I was really going for the whole "easy as Google Analytics to integrate" thing. Any thoughts?

2 Solutions

#1

It sounds like this question really has nothing to do with Google Analytics (I'd really suggest you remove that from your question as I think it's misleading most people and not getting you closer to your answer).

You have some data and you want to share it only with select sites. The only way to do that is to protect the data with some sort of authorization scheme and give the selected sites a password or key that grants access, so that anyone you did not give the key to is denied. Even this scheme only works if the code accessing the data runs in a private area on a server (where keys and passwords can be protected), not as JavaScript in a browser.
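
As a rough illustration of what such a key-based scheme could look like, here is a minimal sketch using an HMAC signature computed on the participating site's server. The function names, the `timestamp:path` message format, and the five-minute window are all assumptions made for this example, not part of any existing API.

```javascript
const crypto = require('node:crypto');

// Hypothetical shared secret, issued to one participating site and kept
// only on that site's server (never shipped in browser JavaScript).
const SHARED_SECRET = 'replace-with-a-long-random-secret';

// Participating site's server: sign the timestamp and request path.
function signRequest(path, timestamp, secret) {
  return crypto
    .createHmac('sha256', secret)
    .update(`${timestamp}:${path}`)
    .digest('hex');
}

// Data owner's server: recompute the signature and compare in constant time.
function verifyRequest(path, timestamp, signature, secret,
                       nowMs = Date.now(), maxAgeMs = 5 * 60 * 1000) {
  // Reject stale or future-dated requests to limit replay.
  if (Math.abs(nowMs - timestamp) > maxAgeMs) return false;
  const expected = Buffer.from(signRequest(path, timestamp, secret), 'hex');
  const given = Buffer.from(String(signature), 'hex');
  // timingSafeEqual throws on length mismatch, so check length first.
  return expected.length === given.length &&
         crypto.timingSafeEqual(expected, given);
}
```

The participating site would send the path, timestamp, and signature with each request. Because the secret never leaves its server, copying the page's JavaScript is not enough to forge a valid request.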

As for the GA spoofing (which I don't think has anything to do with your real question), I suspect Google doesn't worry about it much. Other than a denial-of-service attack on GA in general (which I suspect they do have protection against), what benefit is there in recording hits for someone else's web site? Whoever is doing it can't get access to the data, because the data is in someone else's GA account. I suppose one could do it to annoy someone by screwing up their GA numbers, but without a more profitable motivation, there probably aren't many people trying to do that.

#2

Interesting question.

As the comments hint, Google doesn't really address this. In fact, it's common to have conditional code / preprocessing stuff to disable GA on your staging site / dev boxes, because if you don't it will screw up your numbers.
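
For example, a conditional wrapper along these lines is one way to keep staging and dev traffic out of the numbers. The host list and the `shouldTrack` helper are made up for this sketch; the `_gaq` calls are the ones from the snippet in the question.

```javascript
// Hypothetical list of hostnames that are allowed to report to analytics.
const PRODUCTION_HOSTS = ['www.example.com', 'example.com'];

// True only when the page is served from a production hostname.
function shouldTrack(hostname) {
  return PRODUCTION_HOSTS.includes(String(hostname).toLowerCase());
}

// In the browser, queue the tracking calls only on production hosts.
// The typeof guard lets this file also load outside a browser.
if (typeof document !== 'undefined' && shouldTrack(document.location.hostname)) {
  window._gaq = window._gaq || [];
  window._gaq.push(['_setAccount', 'XXXXXXX']);
  window._gaq.push(['_trackPageview']);
  // ...then load ga.js exactly as in the original snippet.
}
```

This only keeps your own environments from polluting the data. It does nothing to stop a third party from sending hits with your account code.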

You could try a sort of three-legged approach with the analytics server, the customer server, and the client. It could work something like this:

1. The customer's server and your analytics server share a secret key. When the client hits the customer's site, the customer's server tells your analytics server that it wants to start tracking this particular client.
2. Your analytics server generates a session ID for this user and returns a dynamic URL to the customer's server. The URL points to your JavaScript tracking code (or a loader for it), injected with the session ID.
3. The customer's server sends the page to the client. The page contains your client-side tracking code with the unique session ID. Actions are tracked and sent to your analytics server.
4. On your analytics server, you receive tracking information from the client's machine. You check that the session ID is valid and not expired, and that the IP address matches.
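
The analytics-server side of the flow above might be sketched roughly as follows. The in-memory `Map`, the 30-minute lifetime, and the function names are assumptions for illustration. A real deployment would also authenticate the customer's server with the shared secret before calling `createSession`.

```javascript
const crypto = require('node:crypto');

const SESSION_TTL_MS = 30 * 60 * 1000; // assumed 30-minute session lifetime
const sessions = new Map();            // in-memory store, for illustration only

// Called when the customer's server asks to start tracking a visitor.
// Returns the session ID to inject into the tracking-code URL.
function createSession(customerId, clientIp, nowMs = Date.now()) {
  const sessionId = crypto.randomBytes(16).toString('hex');
  sessions.set(sessionId, {
    customerId,
    clientIp,
    expiresAt: nowMs + SESSION_TTL_MS,
  });
  return sessionId;
}

// Called for each tracking hit that arrives from the visitor's browser.
function validateHit(sessionId, requestIp, nowMs = Date.now()) {
  const session = sessions.get(sessionId);
  if (!session) return false;            // unknown session ID
  if (nowMs >= session.expiresAt) {      // expired: drop it
    sessions.delete(sessionId);
    return false;
  }
  return session.clientIp === requestIp; // IP address must match
}
```

Matching on IP address is a coarse check: visitors behind the same NAT share an address, and mobile clients can change address mid-session, so it trades some false rejections for a higher bar against replayed session IDs.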

This should provide an extra level of security. Unfortunately, it will not be as "easy as Google Analytics to integrate", because it requires server-side participation from your customers. It also won't do much good for tracking users who haven't been authenticated by your customers, because a third party could simply visit your customer's site to get a valid session ID and then send fake data to your analytics server. However, for clients authenticated by your customer's site, it could be useful.

Good luck!
