How does Google Analytics collect its data?

Time: 2022-03-20 14:16:46

Yes, I know you have to embed the Google Analytics JavaScript into your page.

But how is the collected information submitted to the Google Analytics server?

For example, an AJAX request should not be possible because of the browser's security settings (cross-domain scripting).

Has anyone already taken a look at the confusing Google JavaScript code?

8 answers

#1


When an HTML page requests the ga.js file, the HTTP request itself already carries a good deal of data: IP address, referrer, browser, language, operating system. There is no need to use AJAX for that.

Some data can't be obtained this way, though, so the GA script inserts an image into the HTML with additional parameters. Take a look at this example:

http://www.google-analytics.com/__utm.gif?utmwv=4.3&utmn=1464271798&utmhn=www.example.com&utmcs=UTF-8&utmsr=1920x1200&utmsc=32-bit&utmul=en-us&utmje=1&utmfl=10.0%20r22&utmdt=Page title&utmhid=1805038256&utmr=0&utmp=/&utmac=cookie value

This is a blank image, sometimes called a tracking pixel, that GA inserts into the HTML.
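
As a rough illustration of the tracking-pixel idea, the sketch below builds such a URL from a set of parameters and, in a browser, requests it through an Image object. The helper names buildPixelUrl and sendPixel are made up for this example and are not part of ga.js; the parameter names are copied from the URL above.

```javascript
// Hypothetical helper: turn a parameter object into a __utm.gif request URL.
function buildPixelUrl(baseUrl, params) {
  var pairs = Object.keys(params).map(function (key) {
    // encodeURIComponent escapes spaces, slashes, etc. so values survive in a query string.
    return encodeURIComponent(key) + '=' + encodeURIComponent(params[key]);
  });
  return baseUrl + '?' + pairs.join('&');
}

// Hypothetical helper: request the URL as an image.
// Assigning src makes the browser issue a plain GET request, which is not
// subject to the cross-domain restrictions that apply to reading AJAX responses.
function sendPixel(url) {
  if (typeof Image === 'undefined') {
    return false; // not running in a browser, nothing is sent
  }
  var img = new Image(1, 1);
  img.src = url;
  return true;
}

var pixelUrl = buildPixelUrl('http://www.google-analytics.com/__utm.gif', {
  utmwv: '4.3',
  utmhn: 'www.example.com',
  utmsr: '1920x1200',
  utmdt: 'Page title',
  utmp: '/'
});
sendPixel(pixelUrl);
```

The response body is irrelevant here; the server only needs to see the request and its query string.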

#2


There are some good answers here, each of which tends to cover one method or another for sending the data. I feel a valuable reference that covers all the methods is missing from the answers above, though.

Google calls the different methods of sending data 'transport mechanisms'.

In the analytics.js documentation, Google mentions the three main transport mechanisms that it uses to send data:

This specifies the transport mechanism with which hits will be sent. The options are 'beacon', 'xhr', or 'image'. By default, analytics.js will try to figure out the best method based on the hit size and browser capabilities. If you specify 'beacon' and the user's browser does not support the navigator.sendBeacon method, it will fall back to 'image' or 'xhr' depending on hit size.

  1. One of the common, standard ways to send data to Google (shown in Thinker's answer) is to add the data as GET parameters to a tracking pixel. This falls under what Google calls the 'image' transport.

  2. Secondly, Google can use the 'beacon' transport method if the client's browser supports it. This is often my preferred method because it will attempt to send the information immediately. Or in Google's words:

This is useful in cases where you wish to track an event just before a user navigates away from your site, without delaying the navigation.

  3. The 'xhr' transport mechanism is the third way that Google Analytics can send data home. Which transport mechanism is used can depend on things such as the size of the hit. (I'm not sure what other factors go into GA's choice of the optimal transport mechanism.)
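
To make the fallback rule in the quoted documentation concrete, here is a minimal sketch of such a decision. It is not the actual analytics.js logic: the chooseTransport function, its options, and the 2000-character image limit are assumptions picked for illustration.

```javascript
// Hypothetical sketch of picking a transport from hit size and browser capabilities.
function chooseTransport(payload, options) {
  var opts = options || {};
  var maxImageLength = opts.maxImageLength || 2000; // assumed limit for a GET query string
  var hasBeacon = Boolean(opts.hasBeacon);

  // Honour an explicit 'beacon' request only when the browser supports it.
  if (opts.preferred === 'beacon' && hasBeacon) {
    return 'beacon';
  }
  // Honour an explicit 'xhr' or 'image' request as-is.
  if (opts.preferred === 'xhr' || opts.preferred === 'image') {
    return opts.preferred;
  }
  // Otherwise fall back by size: small hits fit in an image GET, large ones need a POST.
  return payload.length <= maxImageLength ? 'image' : 'xhr';
}
```

In a browser, hasBeacon could be derived from a check such as `typeof navigator.sendBeacon === 'function'`.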

In case you are curious how to force GA to use a specific transport mechanism, here is a sample snippet that forces this event hit to be sent as a 'beacon':

ga('send', 'event', 'click', 'download-me', {transport: 'beacon'});

Hope this helps.

Also, if you are curious about this topic because you'd like to capture this data and send it to your own site too, I recommend hooking into Google Analytics' send task (sendHitTask), which allows you to grab the payload and send it via AJAX to your own server.

    ga(function(tracker) {

       // Grab a reference to the default sendHitTask function.
       var originalSendHitTask = tracker.get('sendHitTask');

       // Modifies sendHitTask to send a copy of the request to a local server after
       // sending the normal request to www.google-analytics.com/collect.
       tracker.set('sendHitTask', function(model) {
         var payload = model.get('hitPayload');
         originalSendHitTask(model);

         var xhr = new XMLHttpRequest();
         xhr.open('POST', '/index.php?task=mycollect', true);
         xhr.send(payload);
       });
    });
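
On the receiving side, the copied hitPayload is a URL-encoded query string, so it can be unpacked with the standard URLSearchParams class. A minimal sketch follows, written in JavaScript to match the snippet above (the /index.php endpoint would do the equivalent in PHP). parseHitPayload is a hypothetical helper and the sample payload is invented, although v, t, tid, cid and dp are regular Measurement Protocol parameter names.

```javascript
// Hypothetical helper: turn a hit payload string into a plain object.
function parseHitPayload(payload) {
  var fields = {};
  // URLSearchParams splits on '&' and '=' and percent-decodes each value.
  new URLSearchParams(payload).forEach(function (value, key) {
    fields[key] = value;
  });
  return fields;
}

var hit = parseHitPayload('v=1&t=pageview&tid=UA-XXXXX-Y&cid=555&dp=%2Fhome');
console.log(hit.t + ' ' + hit.dp); // prints "pageview /home"
```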

#3


Without looking at the code, I assume their data is collected from the HTTP headers they receive in the asynchronous request.

Remember that most browsers send data such as OS, platform, browser, version, locale, etc. They also have your IP address, so they can estimate your location. And I assume they have some sort of clever algorithm to decide whether you are a unique visitor or not.

Time on the site is probably calculated by using an onUnload() event.

#4


The Google Analytics documentation provides detailed information on how the Google Analytics server collects data: http://code.google.com/apis/analytics/docs/concepts/gaConceptsOverview.html

All Google Analytics data is collected, packed into the query string of a request URL, and sent to the Google Analytics server. The HTTP request is for a GIF image (http://www.google-analytics.com/__utm.gif) and is triggered by the Google Analytics JavaScript.

#5


It's easy enough to tell by using something like Firebug's Net tab.

Ajax isn't needed, since no data is being fetched from Google. They just encode the information in a query string and then load a transparent GIF with it.

#6


To expand on the other very good answers, Google also provides an API for tracking asynchronous "virtual pageviews", which website authors report to Google themselves from their own scripts.

_gaq.push(['_trackPageview', 'my_unique_action']);

They provide it so that it is possible to track actions that are not part of regular page views and HTTP requests.

Async tracking guide: http://code.google.com/apis/analytics/docs/tracking/asyncUsageGuide.html#Syntax
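
The _gaq.push call works even before ga.js has finished loading because _gaq starts out as a plain array that simply stores commands; once the library loads, it processes the stored commands and handles later pushes immediately. The sketch below imitates that command-queue pattern. It is not Google's code, and CommandQueue and the handlers object are invented for the example.

```javascript
// Before the "library" loads, the queue is just an array of commands.
var queue = [];
queue.push(['_trackPageview', 'my_unique_action']);

// Hypothetical stand-in for what a tracking library does once it has loaded.
function CommandQueue(pending, handlers) {
  this.handlers = handlers;
  // Replay every command that was stored while the library was loading.
  for (var i = 0; i < pending.length; i++) {
    this.push(pending[i]);
  }
}

// After loading, push executes the command right away instead of storing it.
CommandQueue.prototype.push = function (command) {
  var name = command[0];
  var args = command.slice(1);
  if (typeof this.handlers[name] === 'function') {
    this.handlers[name].apply(null, args);
  }
};

// Simulate the library loading: swap the array for the live queue.
var tracked = [];
queue = new CommandQueue(queue, {
  _trackPageview: function (page) { tracked.push(page); }
});
queue.push(['_trackPageview', 'another_action']);
```

Because page code only ever calls push, it does not need to know whether the library has loaded yet.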

#7


Use the HttpFox or Firebug Firefox extension to see which HTTP requests the browser sends and which responses it receives.

I don't know how Google Analytics works, but one possibility is to make the browser download an image, <img src="http://my-analytics.com" width="1" height="1"> (a single transparent pixel), and log all the HTTP request headers (e.g. Referer:) on the server side.
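
A server-side counterpart to that idea might look like the sketch below, written as a request handler for Node's built-in http module. The function names and the choice of logged headers are assumptions for illustration, not a description of Google's servers.

```javascript
// A 1x1 transparent GIF, base64-encoded.
var PIXEL = Buffer.from('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7', 'base64');

// Pull the interesting bits out of an incoming request (Node lowercases header names).
function describeRequest(req) {
  return {
    url: req.url,
    referer: req.headers['referer'] || null,
    userAgent: req.headers['user-agent'] || null,
    language: req.headers['accept-language'] || null
  };
}

// Log the request details, then answer with the pixel so the <img> loads normally.
function handlePixelRequest(req, res) {
  console.log(JSON.stringify(describeRequest(req)));
  res.writeHead(200, { 'Content-Type': 'image/gif', 'Cache-Control': 'no-store' });
  res.end(PIXEL);
}

// To serve it (port chosen arbitrarily):
// require('http').createServer(handlePixelRequest).listen(8080);
```

The no-store cache header is there so that repeat visits still reach the server instead of being answered from the browser cache.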

#8


//edit: see comment at the bottom

OK, I found an answer during a discussion with a friend of mine :-) The information is submitted to Google Analytics in three ways:

  1. The HTTP request can be analyzed, along with all the information in the HTTP headers.

  2. A cookie is recognized by the Google Analytics server.

  3. An AJAX call is made within the embedded JavaScript to submit information such as display resolution, Flash Player version, etc. This information is not transmitted via the HTTP headers. This is possible because the AJAX call is made in the context of the embedded JavaScript, so it is not cross-domain scripting. This was an error in my reasoning.
