Web traffic data collection programs: Are they always accurate?
The consensus among SEO professionals is that no analytics software is 100 percent accurate, and in some cases it is not even 70 percent accurate. “So you have to take all data with a grain of salt and measure metrics in terms of percentages (and watching for trends) rather than using them for hard numbers,” said a spokesperson from SEOMoz.
JavaScript Positioning
Studies done by SEOMoz specialists and other search marketing companies seem to indicate that Google Analytics (GA) tends to report higher traffic numbers than other web analytics applications. They surmise that this is due in part to the placement of the JavaScript code on the page. Google Analytics requires the user to place the code in the HTML header, while other programs and good SEO practice dictate that the code be placed toward the end of the document, after the main body content.
Why might this affect the traffic counts? Because the visitor’s browser reaches the GA script sooner than it reaches the other programs’ scripts. The browser reads the HTML document from the top down, so code in the header runs before code placed near the end of the page.
Thus, if the user lands on a page where the script is at the bottom, and then navigates away from the page before the browser gets to the script, the visit won’t be recorded.
JavaScript Disabled
In addition to JS positioning, there is a subset of web users who set their browsers not to execute any JavaScript. Naturally, then, these users are not counted by script-based analytics when they visit your site.
Proxy Servers
Third, there’s the problem of proxy servers. A proxy server is a server that stands “in front” of a main server and acts as a sort of gatekeeper. When a user requests a URL on Server A, for example, Server A sends the page to Proxy Server B. Then, Server B serves it to the user indirectly.
Proxy servers can cause discrepancies in analytics reporting. For example, say you have analytics scripts installed on a page. Someone requests that page’s URL. The page is passed from your server to a proxy server before it reaches the end user. Unknown to you, the proxy server either disables the JS counter before it serves the page to the end user, or performs some other operation that prevents visits from being recorded properly.
Search Engine Robots
Another reason for the discrepancy is the way analytics software treats search engine robots (or spiders). Some analytics programs are designed to ignore page visits from a robot; others are not. Apparently, GA’s counts are often higher than those of other analytics applications because it counts each time one of Google’s own bots goes to a URL on which the GA script is placed. Additionally, since Google has different “bots” that do different things, it is possible that GA counts each visit from each of these different bots.
Log File vs. Tag-Based Analytics
Web traffic measuring programs like Google Analytics & WebTrends are called tag-based analytics. This means they measure web traffic trends by the use of scripts – or tags – inserted directly onto the pages for which the user wants stats.
Log File analytics do not measure traffic this way. Rather, these types of applications go directly to the server itself and count how many times the server serves up a particular page. There are various advantages and disadvantages to both forms of traffic reporting programs. But again, the consensus in SEO circles is that both types of data collection programs have limitations that can produce inaccurate numbers under certain circumstances.
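To make the log file approach concrete, here is a minimal Python sketch of how such a program might count page requests from a server log, with an option to skip requests that look like they come from robots. It assumes the common "combined" access log layout; the sample lines, the BOT_HINTS list, and the count_page_hits function are invented for illustration and are not taken from any real analytics product.

```python
import re
from collections import Counter

# Pulls the requested path and the user agent out of one log line.
# Assumes the "combined" access log layout; other layouts need another pattern.
LINE_RE = re.compile(
    r'"(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical substrings used to guess that a request came from a robot.
BOT_HINTS = ("bot", "spider", "crawl")


def count_page_hits(lines, skip_robots=True):
    """Count how many times each path was served, optionally ignoring robots."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a line this sketch understands
        agent = match.group("agent").lower()
        if skip_robots and any(hint in agent for hint in BOT_HINTS):
            continue
        hits[match.group("path")] += 1
    return hits


# Made-up sample lines, not real traffic.
sample_log = [
    '203.0.113.5 - - [09/Nov/2010:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [09/Nov/2010:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '203.0.113.7 - - [09/Nov/2010:10:01:00 +0000] "GET /about.html HTTP/1.1" 200 300 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [09/Nov/2010:10:02:00 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

without_robots = count_page_hits(sample_log, skip_robots=True)
with_robots = count_page_hits(sample_log, skip_robots=False)
print("Ignoring robots:", dict(without_robots))
print("Counting robots:", dict(with_robots))
```

Running the same log with and without the robot filter gives different totals for the same page, which is one way two programs looking at the same site can disagree.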
My Own Stats: Anecdotal Evidence
I found that for my personal website, picking a random day (Nov 9), GA showed 8 visits, and StatCounter showed 7. Hence, no significant difference. Here are some other data on my website for various dates:
Key: GA = Google Analytics; SC = StatCounter
Nov. 15: GA = 13; SC = 12
Nov. 14: GA = 14; SC = 12
Nov. 13: GA = 12; SC = 12
Nov. 11: GA = 7; SC = 6
Nov. 10: GA = 11; SC = 10
Nov. 9: GA = 8; SC = 9
As you can see, there wasn’t much discrepancy between the two applications, but we’re also dealing with relatively small numbers. Also, it does appear that, in general, SC reports fewer visits than GA does.
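Since the advice earlier was to read analytics in terms of percentages rather than hard numbers, here is a small Python sketch that turns the daily counts listed above into percentage gaps. The numbers are copied from that list as given; measuring the gap relative to SC is just one reasonable choice, and the percent_gap helper is made up for this example.

```python
# Daily visit counts copied from the list above: (date, GA, SC).
visits = [
    ("Nov. 15", 13, 12),
    ("Nov. 14", 14, 12),
    ("Nov. 13", 12, 12),
    ("Nov. 11", 7, 6),
    ("Nov. 10", 11, 10),
    ("Nov. 9", 8, 9),
]


def percent_gap(ga, sc):
    """Return how far GA is above (+) or below (-) SC, as a percentage of SC."""
    return (ga - sc) / sc * 100


for date, ga, sc in visits:
    print(f"{date}: GA {ga}, SC {sc}, gap {percent_gap(ga, sc):+.1f}%")

# Totals over the listed days smooth out some of the day-to-day noise.
ga_total = sum(ga for _, ga, _ in visits)
sc_total = sum(sc for _, _, sc in visits)
print(f"Total: GA {ga_total}, SC {sc_total}, gap {percent_gap(ga_total, sc_total):+.1f}%")
```

With counts this small, a difference of a single visit moves the daily percentage a lot, so the total over several days is the more useful figure to watch.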