Nobody likes spam traffic to their website or blog. It messes with your historical data, uses up your bandwidth, and you don’t know what this traffic is doing on your site; it could be looking for security weaknesses.
While checking our recent Google Analytics reports this morning for Bloggingpro.com I noticed a sharp traffic spike on one of the pages and decided to drill down and see where this traffic was coming from.
On inspection of the traffic referral sources, there is a domain which looks like Lifehacker but something looks a bit strange with the font:
Let’s check the site out…
Oh dear, this website is not Lifehacker.com. What’s happened here is that a web spammer and hacker has changed the ‘k’ in the domain name to ‘ĸ’, so to anyone who isn’t looking closely it looks like the well-known tips and life-advice site. After some Googling, it turns out that this is the work of the same person who used a fake Google.com name, with the ‘G’ changed to a ‘ɢ’, as reported by Analyticsedge.com earlier this month.
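If you would rather not rely on squinting at the font, a few lines of Python can flag the odd characters in a referrer name. This is only a rough sketch: the function name `non_ascii_chars` is made up for this example, and it simply reports any non-ASCII character, so it is not a complete look-alike detector.

```python
import unicodedata

def non_ascii_chars(hostname):
    """Return (character, code point, Unicode name) for each non-ASCII character."""
    found = []
    for ch in hostname:
        if ord(ch) > 127:
            found.append((ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return found

# The two fake names are written with escapes so the swapped letters are explicit.
for host in ["lifehacker.com", "lifehac\u0138er.com", "\u0262oogle.com"]:
    print(host, non_ascii_chars(host))
```

The real Lifehacker.com comes back with an empty list, while the two fakes each report one character that is not a plain ‘k’ or ‘G’.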
Taking a closer look at the Language stats also revealed that we had been getting some other fake traffic from that spammer:
Spam traffic like this in a Google Analytics report is not in itself a security risk (although the same traffic could also be scanning your plugins and looking for security weaknesses), but it is not healthy to have your traffic stats messed up, so it is important to get it cleaned up and filtered out as quickly as possible.
How to remove spam traffic from Google Analytics reports?
We are going to do this using filters and segments. Before we start, though, it is a good idea to make a backup copy of the main, unfiltered ‘view’ of your website or blog. To do this, go to Admin from your dashboard, then click View settings in the View column on the right. You should then see all the basic settings for this view. On the far right, click Copy view. You now have a backup of your unfiltered view, so you can safely go ahead and apply filters.
In your account dashboard, after selecting your site, go to Admin > Filters, then click the red +Add Filter button at the top. You should then see the following:
Fill in the details exactly as shown above: check Create new Filter, give the filter a name, then select Exclude and choose Campaign Source from the drop-down menu. Finally, copy and paste this regex into the Filter Pattern:
ɢoogl|lifehacĸer
This will then exclude all traffic whose campaign source (the referring domain) contains either of these strings. You can check that the filter is working by clicking Verify this filter at the bottom, which should return a box showing the before and after traffic results when that filter is applied. (Note: if you click Save first and then try to verify the filter later, you will most likely get a warning message saying the filter would not have changed your data.)
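As a quick sanity check outside Google Analytics, you can try the pattern against a few source names in Python. This is a minimal sketch, assuming `re.search` is a close enough stand-in for the way the filter matches anywhere in the field; the helper name `is_spam_source` is made up for this example.

```python
import re

# Same pattern as the Filter Pattern above (ɢoogl|lifehacĸer), written with
# escapes so the look-alike letters cannot be confused with a plain G or k.
SPAM_SOURCE = re.compile("\u0262oogl|lifehac\u0138er")

def is_spam_source(source):
    """True if the source name contains either look-alike string."""
    return SPAM_SOURCE.search(source) is not None

samples = ["lifehac\u0138er.com", "\u0262oogle.com", "lifehacker.com", "google.com"]
for source in samples:
    print(source, "->", "excluded" if is_spam_source(source) else "kept")
```

Only the two fake names are excluded, so genuine referrals from Lifehacker.com or Google.com are left alone.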
The next step is to create a filter to exclude the annoying ‘Secret.Google.com’ from the Language stats. Create a new filter just like above and then this time select Language for the Filter Field and put this regular expression in the Filter Pattern:
\s[^\s]*\s|.{15,}|\.|,
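To see what this language pattern catches, here is a small Python sketch. It assumes the second character class is meant to be `[^\s]` (any non-whitespace character) and that `re.search` is a reasonable stand-in for the filter's matching. The helper name and the sample values are made up for illustration.

```python
import re

# Matches a value that contains at least two whitespace characters,
# or is 15 or more characters long, or contains a dot or a comma.
SPAM_LANGUAGE = re.compile(r"\s[^\s]*\s|.{15,}|\.|,")

def is_spam_language(value):
    """True if the language value looks like a spam message rather than a language code."""
    return SPAM_LANGUAGE.search(value) is not None

samples = ["en-us", "pt-br", "Secret.Google.com", "vote for someone"]
for value in samples:
    print(repr(value), "->", "excluded" if is_spam_language(value) else "kept")
```

Genuine language codes such as en-us are short and contain no spaces, dots or commas, so they pass through, while message-style values are caught.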
Save that filter. Your two filters are now set to exclude any traffic from these sources, so it will not appear in your reports in the future. Filters only apply to data collected from now on, however, so if you want to view your existing data with this spam traffic excluded straight away, you need to create a custom segment. Go back to the Audience Overview, click Add Segment, then click the red +New Segment button, then Conditions under Advanced, and fill in the fields as follows:
If you run your reports now you should see that any of the traffic from those spam sources is excluded. Phew!
Special thanks and hat tip to Carlos Escalera of Ohow.co for providing in-depth articles on all of the above. I strongly recommend his site for getting to grips with Google Analytics.
If you liked this post and hate bot traffic on your blog, then check out my other analytics piece on how to filter out ‘C’ Language, also over at Performancing.com 🙂
How To Filter Out C Language or ‘Bot’ Traffic From Google Analytics
Author: David Jones
David has been working in the online media industry for over 7 years. He writes about technical SEO for bloggers and is also a Python coding & Linux enthusiast.
Oh wow! This is great stuff. A friend asked me about this traffic today. Thankfully, I went into my GA, applied your filters, and now I’m back in business!
B
Even I faced the spam traffic from these spammy sites and wrote an article today itself. It’s good to see you have also written about it and others are finding it helpful
Hey David,
There are many spammers waiting for the vulnerabilities. It’s good to know that people are using the trick of replacing the ‘k’.
Thanks for the info.
~Ravi
It is really ruining the analytics records that we use for future projects. I think Analytics should deal with this from their side, as we cannot block large numbers of domain names.