Tag Archive | "Analytics"

Moz Acquires STAT Search Analytics: We’re Better Together!

Posted by SarahBird

We couldn’t be more thrilled to announce that Moz has acquired STAT Search Analytics!

It’s not hard to figure out why, right? We both share a vision around creating search solutions that will change the industry. We’re both passionate about investing in our customers’ success. Together we provide a massive breadth of high-quality, actionable data and insights for marketers. Combining Moz’s SEO research tools and local search expertise with STAT’s daily localized rankings and SERP analytics, we have the most robust organic search solution in the industry.

I recently sat down with my friend Rob Bucci, our new VP of Research & Development and most recently the CEO of STAT, to talk about how this came to be and what to expect next. Check it out:

You can also read Rob’s thoughts on everything here over on the STAT blog!

With our powers combined…

Over the past few months, Moz’s data has gotten some serious upgrades. Notably, with the launch of our new link index in April, the data that feeds our tools is now 35x larger and 30x fresher than it was before. In August we doubled our keyword corpus and expanded our data for the UK, Canada, and Australia, positioning us to lead the market in keyword research and link building tools. Throughout 2018, we’ve made significant improvements to Moz Local’s UI with a brand-new dashboard, making sure our business listing accuracy tool is as usable as it is useful. Driving the blood, sweat, and tears behind these upgrades is a simple purpose: to provide our customers with the best SEO tools money can buy.

STAT is intimately acquainted with this level of customer obsession. Their team has created the best enterprise-level SERP analysis software on the market. More than just rank tracking, STAT’s data is a treasure trove of consumer research, competitive intel, and the deep search analytics that enable SEOs to level up their game.

Moz + STAT together provide a breadth and depth of data that hasn’t existed before in our industry. Organic search shifts from tactics to strategy when you have this level of insight at your disposal, and we can’t wait to reveal what industry-changing products we’ll build together.

Our shared values and vision

Aside from the technology powerhouse this partnership will build, we also couldn’t have found a better culture fit than STAT. With values like selflessness, ambition, and empathy, STAT embodies TAGFEE. Moz and STAT are elated to be coming together as a single company dedicated to developing the best organic search solutions for our customers while also fostering an awesome culture for our employees.

Innovation awaits!

To Moz and STAT customers: the future is bright. Expect more updates, more innovation, and more high-quality data at your disposal than ever before. As we grow together, you’ll grow with us.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!


Moz Blog

Posted in Latest News | Comments Off

SearchCap: Google+ closing down, Bing Ads Editor in-market audiences & Google My Business analytics

Below is what happened in search today, as reported on Search Engine Land and from other places across the web.



Please visit Search Engine Land for the full article.


Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing



Working around Google Analytics to improve your content marketing

The way Google Analytics reports bounce rate and time on page leaves a lot to be desired. Contributor Marcus Miller outlines two easy ways to get better data on single-page visits so marketers understand how users engage with their content.



Please visit Search Engine Land for the full article.





SearchCap: #SMXperts, Google Analytics, cutting content & more

Below is what happened in search today, as reported on Search Engine Land and from other places across the web.



Please visit Search Engine Land for the full article.




Google Data Studio now gains 16 months of Search Analytics data

Now you can get 16 months of Search Analytics data in Data Studio, the Search Analytics API, and the beta Search Console reports.



Please visit Search Engine Land for the full article.




Digital Marketing News: Behavior & Analytics Studies, Facebook’s A/B Testing, & LinkedIn’s Carousel Ads

Perceived Influence Marketing Charts Graph

As Concerns Grow Over Internet Privacy, Most Say Search & Social Have Too Much Power
How Internet users perceive the influence a variety of popular online platforms have over their lives was among the subjects examined in a sizable new joint report by Ipsos, the Internet Society, and the Centre for International Governance Innovation, offering some surprising insight for digital marketers. Marketing Charts

Facebook Experiments with A/B Testing for Page Posts
Facebook has been trying out A/B testing of Facebook Page posts, a feature that if rolled out in earnest could eventually have significant implications for digital marketers. Social Media Today

CMOs Say Digital Marketing Is Most Effective: Nielsen Study
Accurately measuring digital marketing advertising spending’s return on investment remains a challenge, while the overall effectiveness of digital ad spend has grown, according to a fascinating new Nielsen study of chief marketing officers. Broadcasting & Cable

Snapchat Rolls Out Option to ‘Unsend’ Messages, New eCommerce Tools
Snapchat has added several e-commerce tools including an in-app ticket purchase solution, branded augmented-reality games, and has given its users the option to unsend messages. Social Media Today

People Are Changing the Way They Use Social Media
Trust of various social media platforms and how Internet users’ self-censorship has changed since 2013 are among the observations presented in the results of a broad new study conducted by The Atlantic. The Atlantic

Facebook launches tool to let users rate advertisers’ customer service
Facebook has added a feedback tool that lets users rate and review advertisers’ customer service, feedback the company says will help it find and even ban sellers with poor ratings. Marketing Land


Google’s about-face on GDPR consent tool is monster win for ad-tech companies
Google reversed its General Data Protection Regulation course recently, allowing publishers to work with an unlimited number of vendors, presenting new opportunities for advertising technology firms. AdAge

LinkedIn rolls out Sponsored Content carousel ads that can include up to 10 customized, swipeable cards
LinkedIn (client) has rolled out a variety of new ad types and more performance metrics for marketers, with its Sponsored Content carousel ads that allow up to 10 custom images. Marketing Land

Report: Facebook is Primary Referrer For Lifestyle Content, Google Search Dominates Rest
What people care about and where they look for relevant answers online are among the marketing-related insights revealed in a recent report from Web analytics firm Parse.ly. Facebook was many users’ go-to source for answers for lifestyle content, while Google was the top source for all other content types. MediaPost

Survey: 87% of mobile marketers see success with location targeting
Location targeting is widely used and has performed well in the mobile marketing realm, helping increase conversion rates and how well marketers understand their audiences, according to new report data. Marketing Land

ON THE LIGHTER SIDE:


A lighthearted look at marketing short-termism, by Marketoonist Tom Fishburne — Marketoonist

‘The weird one wins’: MailChimp’s CMO on the company’s off-the-wall advertising — The Drum

TOPRANK MARKETING & CLIENTS IN THE NEWS:

  • Lee Odden — Why Content Marketing is Good for B2B Companies — Atomic Reach
  • Lee Odden — Top 2018 Influencers That Might Inspire Your Inner Marketer — Whatagraph
  • Lee Odden — Better than Bonuses: 4 Motivators that Matter More than Money — Workfront
  • Anne Leuman — What’s Trending: Marketing GOOOOOAAAALS! — LinkedIn (client)

Thanks for visiting, and please join us next week for a new selection of the latest digital marketing news, and in the meantime you can follow us at @toprank on Twitter for even more timely daily news. Also, don’t miss the full video summary on our TopRank Marketing TV YouTube Channel.

The post Digital Marketing News: Behavior & Analytics Studies, Facebook’s A/B Testing, & LinkedIn’s Carousel Ads appeared first on Online Marketing Blog – TopRank®.



Trust Your Data: How to Efficiently Filter Spam, Bots, & Other Junk Traffic in Google Analytics

Posted by Carlosesal

There is no doubt that Google Analytics is one of the most important tools you could use to understand your users’ behavior and measure the performance of your site. There’s a reason it’s used by millions across the world.

But despite being such an essential part of the decision-making process for many businesses and blogs, I often find sites (of all sizes) that do little or no data filtering after installing the tracking code, which is a huge mistake.

Think of a Google Analytics property without filtered data as one of those styrofoam cakes with edible parts. It may seem genuine from the top, and it may even feel right when you cut a slice, but as you go deeper and deeper you find that much of it is artificial.

If you’re one of those who haven’t properly configured their Google Analytics and you only pay attention to the summary reports, you probably won’t notice that all sorts of bogus information is mixed in with your real user data.

And as a consequence, you won’t realize that your efforts are being wasted on analyzing data that doesn’t represent the actual performance of your site.

To make sure you’re getting only the real ingredients and prevent you from eating that slice of styrofoam, I’ll show you how to use the tools that GA provides to eliminate all the artificial excess that inflates your reports and corrupts your data.

Common Google Analytics threats

As most of the people I’ve worked with know, I’ve always been obsessed with the accuracy of data, mainly because as a marketer/analyst there’s nothing worse than realizing that you’ve made a wrong decision because your data wasn’t accurate. That’s why I’m continually exploring new ways of improving it.

As a result of that research, I wrote my first Moz post about the importance of filtering in Analytics, specifically about ghost spam, which was a significant problem at that time and still is (although to a lesser extent).

While the methods described there are still quite useful, I’ve since been researching solutions for other types of Google Analytics spam and a few other threats that might not be as annoying, but that are equally or even more harmful to your Analytics.

Let’s review, one by one.

Ghosts, crawlers, and other types of spam

The GA team has done a pretty good job handling ghost spam. The amount of it has been dramatically reduced over the last year, compared to the outbreak in 2015/2017.

However, the millions of current users and the thousands of new, unaware users that join every day, plus the majority’s curiosity to discover why someone is linking to their site, make Google Analytics too attractive a target for the spammers to just leave it alone.

The same logic can be applied to any widely used tool: no matter what security measures it has, there will always be people trying to abuse its reach for their own interest. Thus, it’s wise to add an extra security layer.

Take, for example, the most popular CMS: WordPress. Despite having some built-in security measures, if you don’t take additional steps to protect it (like setting a strong username and password or installing a security plugin), you run the risk of being hacked.

The same happens to Google Analytics, but instead of plugins, you use filters to protect it.

In which reports can you look for spam?

Spam traffic will usually show as a Referral, but it can appear in any part of your reports, even in unexpected places like a language or a page title.

Sometimes spammers will try to fool you with misleading URLs that look very similar to known websites, or they may try to get your attention by using unusual characters and emojis in the source name.

Whatever the type of spam, there are three things you should always do when you think you’ve found some in your reports:

  1. Never visit the suspicious URL. Most of the time they’ll try to sell you something or promote their service, but some spammers might have some malicious scripts on their site.
  2. This goes without saying, but never install scripts from unknown sites; if for some reason you did, remove them immediately and scan your site for malware.
  3. Filter out the spam in your Google Analytics to keep your data clean (more on that below).

If you’re not sure whether an entry on your report is real, try searching for the URL in quotes (“example.com”). Your browser won’t open the site, but instead will show you the search results; if it is spam, you’ll usually see posts or forums complaining about it.

If you still can’t find information about that particular entry, give me a shout — I might have some knowledge for you.

Bot traffic

A bot is a piece of software that runs automated scripts over the Internet for different purposes.

There are all kinds of bots. Some have good intentions, like the bots used to check copyrighted content or the ones that index your site for search engines, and others not so much, like the ones scraping your content to clone it.

2016 bot traffic report. Source: Incapsula

In either case, this type of traffic is not useful for your reporting and might be even more damaging than spam both because of the amount and because it’s harder to identify (and therefore to filter it out).

It’s worth mentioning that bots can be blocked from your server to stop them from accessing your site completely, but this usually involves editing sensitive files that require a high level of technical knowledge, and as I said before, there are good bots too.

So, unless you’re under a direct attack that’s straining your server resources, I recommend you just filter them in Google Analytics.

In which reports can you look for bot traffic?

Bots will usually show as Direct traffic in Google Analytics, so you’ll need to look for patterns in other dimensions to be able to filter it out. For example, large companies that use bots to navigate the Internet will usually have a unique service provider.

I’ll go into more detail on this below.

Internal traffic

Most users get worried and anxious about spam, which is normal — nobody likes weird URLs showing up in their reports. However, spam isn’t the biggest threat to your Google Analytics.

You are!

The traffic generated by people (and bots) working on the site is often overlooked despite the huge negative impact it has. The main reason it’s so damaging is that in contrast to spam, internal traffic is difficult to identify once it hits your Analytics, and it can easily get mixed in with your real user data.

There are different types of internal traffic and different ways of dealing with it.

Direct internal traffic

Testers, developers, marketing team, support, outsourcing… the list goes on. Any member of the team that visits the company website or blog for any purpose could be contributing traffic.

In which reports can you look for direct internal traffic?

Unless your company uses a private ISP domain, this traffic is tough to identify once it arrives, and it will usually show as Direct in Google Analytics.

Third-party sites/tools

This type of internal traffic includes traffic generated directly by you or your team when using tools to work on the site; for example, management tools like Trello or Asana.

It also includes traffic coming from bots doing automatic work for you; for example, services used to monitor the performance of your site, like Pingdom or GTmetrix.

Some types of tools you should consider:

  • Project management
  • Social media management
  • Performance/uptime monitoring services
  • SEO tools

In which reports can you look for internal third-party tools traffic?

This traffic will usually show as Referral in Google Analytics.

Development/staging environments

Some websites use a test environment to make changes before applying them to the main site. Normally, these staging environments have the same tracking code as the production site, so if you don’t filter it out, all the testing will be recorded in Google Analytics.

In which reports can you look for development/staging environments?

This traffic will usually show as Direct in Google Analytics, but you can find it under its own hostname (more on this later).

Web archive sites and cache services

Archive sites like the Wayback Machine offer historical views of websites. The reason you can see those visits on your Analytics — even if they are not hosted on your site — is that the tracking code was installed on your site when the Wayback Machine bot copied your content to its archive.

One thing is for certain: when someone goes to check how your site looked in 2015, they don’t have any intention of buying anything from your site — they’re simply doing it out of curiosity, so this traffic is not useful.

In which reports can you look for traffic from web archive sites and cache services?

You can also identify this traffic on the hostname report.

A basic understanding of filters

The solutions described below use Google Analytics filters, so to avoid problems and confusion, you’ll need a basic understanding of how they work, and you should check a few prerequisites.

Things to consider before using filters:

1. Create an unfiltered view.

Before you do anything, it’s highly recommended to make an unfiltered view; it will help you track the efficacy of your filters. Plus, it works as a backup in case something goes wrong.

2. Make sure you have the correct permissions.

You will need edit permissions at the account level to create filters; edit permissions at the view or property level won’t work.

3. Filters don’t work retroactively.

In GA, aggregated historical data can’t be deleted, at least not permanently. That’s why the sooner you apply the filters to your data, the better.

4. The changes made by filters are permanent!

If your filter is not correctly configured because you didn’t enter the correct expression (missing relevant entries, a typo, an extra space, etc.), you run the risk of losing valuable data FOREVER; there is no way of recovering filtered data.

But don’t worry — if you follow the recommendations below, you shouldn’t have a problem.

5. Wait for it.

Most of the time you can see the effect of the filter within minutes or even seconds after applying it; however, officially it can take up to twenty-four hours, so be patient.

Types of filters

There are two main types of filters: predefined and custom.

Predefined filters are very limited, so I rarely use them. I prefer to use the custom ones because they allow regular expressions, which makes them a lot more flexible.

Within the custom filters, there are five types: exclude, include, lowercase/uppercase, search and replace, and advanced.

Here we will use the first two: exclude and include. We’ll save the rest for another occasion.

Essentials of regular expressions

If you already know how to work with regular expressions, you can jump to the next section.

REGEX (short for regular expressions) are text strings prepared to match patterns using special characters. These characters let you match multiple entries with a single filter.

Don’t worry if you don’t know anything about them. We will use only the basics, and for some filters, you will just have to COPY-PASTE the expressions I pre-built.

REGEX special characters

There are many special characters in REGEX, but for basic GA expressions we can focus on three:

  • ^ The caret: used to indicate the beginning of a pattern,
  • $ The dollar sign: used to indicate the end of a pattern,
  • | The pipe or bar: means “OR,” and it is used to indicate that you are starting a new pattern.

When using the pipe character, you should never ever:

  • Put it at the beginning of the expression,
  • Put it at the end of the expression,
  • Put 2 or more together.

Any of those will mess up your filter and probably your Analytics.
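To see why a misplaced pipe is so dangerous, here is a small Python sketch (the domain names are invented for illustration). A leading or trailing pipe creates an empty alternative, and the empty pattern matches every string, so an exclude filter built that way would wipe out all your data:

```python
import re

# BAD: the leading pipe adds an empty alternative, which matches everything
bad_pattern = re.compile(r"|spamdomain\.com")

# GOOD: only the intended domain matches
good_pattern = re.compile(r"spamdomain\.com")

print(bool(bad_pattern.search("totally-legit-referrer.com")))   # True -- matches!
print(bool(good_pattern.search("totally-legit-referrer.com")))  # False
```

Used in an exclude filter, the bad pattern would exclude every hit, which is exactly the kind of permanent data loss warned about above.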

A simple example of REGEX usage

Let’s say I go to a restaurant that has an automatic machine that makes fruit salad, and to choose the fruit, you have to use regular expressions.

This super machine has the following fruits to choose from: strawberry, orange, blueberry, apple, pineapple, and watermelon.

To make a salad with my favorite fruits (strawberry, blueberry, apple, and watermelon), I have to create a REGEX that matches all of them. Easy! Since the pipe character “|” means OR I could do this:

  • REGEX 1: strawberry|blueberry|apple|watermelon

The problem with that expression is that REGEX also considers partial matches, and since pineapple also contains “apple,” it would be selected as well… and I don’t like pineapple!

To avoid that, I can use the other two special characters I mentioned before to make an exact match for apple: the caret “^” (begins here) and the dollar sign “$” (ends here). It will look like this:

  • REGEX 2: strawberry|blueberry|^apple$|watermelon

The expression will select precisely the fruits I want.

But let’s say for demonstration’s sake that the fewer characters you use, the cheaper the salad will be. To optimize the expression, I can take advantage of REGEX’s partial matching.

Since strawberry and blueberry both contain “berry,” and no other fruit in the list does, I can rewrite my expression like this:

  • Optimized REGEX: berry|^apple$|watermelon

That’s it — now I can get my fruit salad with the right ingredients, and at a lower price.
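The fruit-salad logic can be checked with a few lines of Python; `re.search` mimics the partial matching that GA filter patterns use (the fruit list and helper function are just this example's props):

```python
import re

fruits = ["strawberry", "orange", "blueberry", "apple", "pineapple", "watermelon"]

def select(pattern):
    # GA filter patterns use partial matching, like re.search
    return [f for f in fruits if re.search(pattern, f)]

# REGEX 1: pineapple sneaks in through the partial match on "apple"
print(select(r"strawberry|blueberry|apple|watermelon"))

# REGEX 2: anchoring apple with ^ and $ keeps pineapple out
print(select(r"strawberry|blueberry|^apple$|watermelon"))

# Optimized REGEX: same selection, fewer characters
print(select(r"berry|^apple$|watermelon"))
```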

3 ways of testing your filter expression

As I mentioned before, filter changes are permanent, so you have to make sure your filters and REGEX are correct. There are 3 ways of testing them:

  • Right from the filter window; just click on “Verify this filter,” quick and easy. However, it’s not the most accurate since it only takes a small sample of data.

  • Using an online REGEX tester; this is very accurate (and colorful), and you can learn a lot from these tools, since they highlight exactly which parts match and briefly explain why.

  • Using an in-table temporary filter in GA; you can test your filter against all your historical data. This is the most precise way of making sure you don’t miss anything.

If you’re doing a simple filter or you have plenty of experience, you can use the built-in filter verification. However, if you want to be 100% sure that your REGEX is ok, I recommend you build the expression on the online tester and then recheck it using an in-table filter.

Quick REGEX challenge

Here’s a small exercise to get you started. Go to this premade example with the optimized expression from the fruit salad case and test the first 2 REGEX I made. You’ll see live how the expressions impact the list.

Now make your own expression to pay as little as possible for the salad.

Remember:

  • We only want strawberry, blueberry, apple, and watermelon;
  • The fewer characters you use, the less you pay;
  • You can do small partial matches, as long as they don’t include the forbidden fruits.

Tip: You can do it with as few as 6 characters.
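If you want to score your own attempt, here is a small checker in the same spirit (the fruit list and scoring are this exercise's assumptions, not anything GA-specific). It returns whether your expression picks exactly the right fruits, plus its character count:

```python
import re

FRUITS = ["strawberry", "orange", "blueberry", "apple", "pineapple", "watermelon"]
WANTED = {"strawberry", "blueberry", "apple", "watermelon"}

def check(pattern):
    # Partial matching, like GA filters; returns (correct?, price in characters)
    picked = {f for f in FRUITS if re.search(pattern, f)}
    return picked == WANTED, len(pattern)

# The optimized expression from the article passes at 24 characters;
# see if you can beat it.
print(check(r"berry|^apple$|watermelon"))  # (True, 24)
```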

Now that you know the basics of REGEX, we can continue with the filters below. But I encourage you to put “learn more about REGEX” on your to-do list — they can be incredibly useful not only for GA, but for many tools that allow them.

How to create filters to stop spam, bots, and internal traffic in Google Analytics

Back to our main event: the filters!

Where to start: To avoid being repetitive when describing the filters below, here are the standard steps you need to follow to create them:

  1. Go to the admin section in your Google Analytics (the gear icon at the bottom left corner),
  2. Under the View column (master view), click the “Filters” button (don’t click on “All Filters” in the Account column):
  3. Click the red button “+Add Filter” (if you don’t see it or you can only apply/remove already created filters, then you don’t have edit permissions at the account level. Ask your admin to create them or give you the permissions.):
  4. Then follow the specific configuration for each of the filters below.

The filter window is your best partner for improving the quality of your Analytics data, so it will be a good idea to get familiar with it.

Valid hostname filter (ghost spam, dev environments)

Prevents traffic from:

  • Ghost spam
  • Development hostnames
  • Scraping sites
  • Cache and archive sites

This filter may be the single most effective solution against spam. In contrast with other commonly shared solutions, the hostname filter is preventative, and it rarely needs to be updated.

Ghost spam earns its name because it never really visits your site. It’s sent directly to the Google Analytics servers using a feature called Measurement Protocol, a tool that under normal circumstances allows tracking from devices you wouldn’t imagine could be tracked, like coffee machines or refrigerators.

Real users pass through your server, then the data is sent to GA; hence it leaves valid information. Ghost spam is sent directly to GA servers, without knowing your site URL; therefore all data left is fake. Source: carloseo.com

The spammer abuses this feature to simulate visits to your site, most likely using automated scripts to send traffic to randomly generated tracking codes (UA-0000000-1).

Since these hits are random, the spammers don’t know who they’re hitting; for that reason, ghost spam will always leave a fake or (not set) hostname. Using that logic, if you create a filter that includes only valid hostnames, all ghost spam will be left out.
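To make the mechanics concrete, here is a sketch of the kind of hit a ghost spammer fabricates. The parameter names come from the public Measurement Protocol (v1) reference; the values are made up, and nothing is sent anywhere:

```python
# A Measurement Protocol (v1) payload a ghost spammer might POST directly
# to GA's /collect endpoint -- no visit to your site ever occurs.
payload = {
    "v": "1",                   # protocol version
    "tid": "UA-0000000-1",      # a randomly generated tracking ID
    "cid": "555",               # arbitrary client ID
    "t": "pageview",            # hit type
    "dr": "spam-site.example",  # fake referrer that shows up in your reports
    # The spammer never knows your real site, so the document hostname ("dh")
    # is fake or missing -- which is exactly what the hostname filter catches.
}
print(payload["dr"])
```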

Where to find your hostnames

Now here comes the “tricky” part. To create this filter, you will need to make a list of your valid hostnames.

A list of what!?

Essentially, a hostname is any place where your GA tracking code is present. You can get this information from the hostname report:

  • Go to Audience > Technology > Network. At the top of the table, change the primary dimension to Hostname.

If your Analytics is active, you should see at least one: your domain name. If you see more, scan through them and make a list of all the ones that are valid for you.

Types of hostname you can find

The good ones:

  • Your domain and subdomains: yourdomain.com
  • Tools connected to your Analytics: YouTube, MailChimp
  • Payment gateways: Shopify, booking systems
  • Translation services: Google Translate
  • Mobile speed-up services: Google Weblight

The bad ones (by bad, I mean not useful for your reports):

  • Staging/development environments: staging.yourdomain.com
  • Internet archive sites: web.archive.org
  • Scraping sites that don’t bother to trim the content: the URL of the scraper
  • Spam: most of the time they will show their own URL, but sometimes they may use the name of a known website to try to fool you. If you see a URL you don’t recognize, just ask yourself, “do I manage it?” If the answer is no, then it isn’t your hostname.
  • (not set) hostname: usually comes from spam; on rare occasions it’s related to tracking code issues.

Below is an example of my hostname report. It’s from the unfiltered view, of course; the master view is squeaky clean.

Now with the list of your good hostnames, make a regular expression. If you only have your domain, then that is your expression; if you have more, create an expression with all of them as we did in the fruit salad example:

Hostname REGEX (example)


yourdomain.com|hostname2|hostname3|hostname4

Important! You cannot create more than one “Include hostname” filter; if you do, you will exclude all data. So try to fit all your hostnames into one expression (the filter field allows up to 255 characters).

The “valid hostname filter” configuration:

  • Filter Name: Include valid hostnames
  • Filter Type: Custom > Include
  • Filter Field: Hostname
  • Filter Pattern: [hostname REGEX you created]
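As a sanity check, the include logic can be sketched in Python. The hostnames here are hypothetical stand-ins for your own list; the point is that an include filter keeps only hits whose hostname matches the expression:

```python
import re

# Hypothetical valid-hostname expression -- substitute your own list
VALID_HOSTNAMES = re.compile(r"yourdomain\.com|translate\.googleusercontent\.com")

hits = [
    {"hostname": "yourdomain.com", "source": "google"},
    {"hostname": "(not set)", "source": "spam-site.example"},  # ghost spam
    {"hostname": "web.archive.org", "source": "(direct)"},     # archive copy
]

# An INCLUDE filter keeps only matching hits; everything else is dropped
kept = [h for h in hits if VALID_HOSTNAMES.search(h["hostname"])]
print([h["hostname"] for h in kept])  # ['yourdomain.com']
```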

Campaign source filter (Crawler spam, internal sources)

Prevents traffic from:

  • Crawler spam
  • Internal third-party tools (Trello, Asana, Pingdom)

Important note: Even if these hits are shown as a referral, the field you should use in the filter is “Campaign source” — the field “Referral” won’t work.

Filter for crawler spam

The second most common type of spam is crawler spam. Crawlers also pretend to be valid visits by leaving a fake source URL, but in contrast with ghost spam, they do access your site; therefore, they leave a correct hostname.

You will need to create an expression the same way as for the hostname filter, but this time you will put together the sources/URLs of the spammy traffic. The difference is that you can create multiple exclude filters.

Crawler REGEX (example)


spam1|spam2|spam3|spam4

Crawler REGEX (pre-built)


As promised, here are the latest pre-built crawler expressions that you just need to copy and paste.

The “crawler spam filter” configuration:

  • Filter Name: Exclude crawler spam 1
  • Filter Type: Custom > Exclude
  • Filter Field: Campaign source
  • Filter Pattern: [crawler REGEX]
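The exclude logic is the mirror image of the hostname filter: matching hits are dropped instead of kept. A quick Python sketch (the sample domains are from well-known crawler-spam campaigns of that era; your own list will differ):

```python
import re

# Example crawler-spam expression -- build yours from your own reports
CRAWLER_SPAM = re.compile(r"semalt|buttons-for-website|best-seo-offer")

sources = ["google", "semalt.com", "moz.com", "best-seo-offer.com"]

# An EXCLUDE filter drops hits whose campaign source matches
clean = [s for s in sources if not CRAWLER_SPAM.search(s)]
print(clean)  # ['google', 'moz.com']
```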

Filter for internal third-party tools

Although you can combine your crawler spam filter with internal third-party tools, I like to have them separated, to keep them organized and more accessible for updates.

The “internal tools filter” configuration:

  • Filter Name: Exclude internal tool sources
  • Filter Type: Custom > Exclude
  • Filter Field: Campaign source
  • Filter Pattern: [tool source REGEX]

Internal Tools REGEX (example)


trello|asana|redmine

If one of the tools you use internally also sends you traffic from real visitors, don’t filter it. Instead, use the “Exclude Internal URL Query” below.

For example, I use Trello, but since I share analytics guides on my site, some people link them from their Trello accounts.

Filters for language spam and other types of spam

The previous two filters will stop most of the spam; however, some spammers use different methods to bypass the previous solutions.

For example, they try to confuse you by showing one of your valid hostnames combined with a well-known source like Apple, Google, or Moz. Even my site has been a target (not saying that everyone knows my site; it just looks like the spammers don’t agree with my guides).

However, even if the source and host look fine, the spammer injects their message in another part of your reports, like the keyword, the page title, or even the language.

In those cases, you will have to take the dimension/report where you find the spam and choose that name in the filter. It’s important to consider that the name of the report doesn’t always match the name in the filter field:

  • Language report: filter field “Language settings”
  • Referral report: filter field “Campaign source”
  • Organic Keyword report: filter field “Search term”
  • Service Provider report: filter field “ISP Organization”
  • Network Domain report: filter field “ISP Domain”

Here are a couple of examples.

The “language spam/bot filter” configuration:

  • Filter Name: Exclude language spam
  • Filter Type: Custom > Exclude
  • Filter Field: Language settings
  • Filter Pattern: [Language REGEX]

Language Spam REGEX (Prebuilt)


\s[^\s]*\s|.{15,}|\.|,|^c$

The expression above excludes fake languages that don’t meet the required format. For example, take these weird messages appearing instead of regular languages like en-us or es-es:

Examples of language spam
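To see what the expression above actually excludes, here's a small check in Python (the test values are illustrative; GA filters use RE2, but these patterns behave the same here):

```python
import re

# The language-spam expression: flags values containing internal spaces,
# 15+ characters, dots, commas, or the lone value "c".
lang_spam = re.compile(r"\s[^\s]*\s|.{15,}|\.|,|^c$")

# Valid language codes pass untouched; spam-like values are flagged.
values = ["en-us", "es-es", "c", "Secret.domain.com You are invited!", "en-us,en"]
flagged = [v for v in values if lang_spam.search(v)]
print(flagged)  # ['c', 'Secret.domain.com You are invited!', 'en-us,en']
```
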

The organic/keyword spam filter configuration:

  • Filter Name: Exclude organic spam
  • Filter Type: Custom > Exclude
  • Filter Field: Search term
  • Filter Pattern: [keyword REGEX]

Filters for direct bot traffic

Bot traffic is a little trickier to filter because it doesn’t leave a source like spam, but it can still be filtered with a bit of patience.

The first thing you should do is enable bot filtering. In my opinion, it should be enabled by default.

Go to the Admin section of your Analytics and click on View Settings. You will find the option “Exclude all hits from known bots and spiders” below the currency selector:

It would be wonderful if this would take care of every bot — a dream come true. However, there’s a catch: the key here is the word “known.” This option only takes care of known bots included in the “IAB known bots and spiders list.” That’s a good start, but far from enough.

There are a lot of “unknown” bots out there that are not included in that list, so you’ll have to play detective and search for patterns of direct bot traffic through different reports until you find something that can be safely filtered without risking your real user data.

To start your bot trail search, click on the Segment box at the top of any report, and select the “Direct traffic” segment.

Then navigate through different reports to see if you find anything suspicious.

Some reports to start with:

  • Service provider
  • Browser version
  • Network domain
  • Screen resolution
  • Flash version
  • Country/City

Signs of bot traffic

Although bots are hard to detect, there are some signals you can follow:

  • An unnatural increase of direct traffic
  • Old versions (browsers, OS, Flash)
  • They visit the home page only (usually represented by a slash “/” in GA)
  • Extreme metrics:
    • Bounce rate close to 100%,
    • Session time close to 0 seconds,
    • 1 page per session,
    • 100% new users.

Important! If you find traffic that checks off many of these signals, it is likely bot traffic. However, not all entries with these characteristics are bots, and not all bots match these patterns, so be cautious.

Perhaps the most useful report that has helped me identify bot traffic is the "Service Provider" report. Large corporations frequently act as their own Internet service provider, so their company name shows up here instead of a consumer ISP.

I also have a pre-built expression for ISP bots, similar to the crawler expressions.

The bot ISP filter configuration:

  • Filter Name: Exclude bots by ISP
  • Filter Type: Custom > Exclude
  • Filter Field: ISP organization
  • Filter Pattern: [ISP provider REGEX]

ISP provider bots REGEX (prebuilt)


hubspot|^google\sllc$|^google\sinc\.$|alibaba\.com\sllc|ovh\shosting\sinc\.

Latest ISP bot expression
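A quick way to verify that the anchors behave as intended (so a legitimate provider like Google Fiber isn't filtered) is to test the expression against sample "ISP organization" values. Here's a sketch in Python with illustrative providers; one thing to watch for is that stray spaces inside the expression become part of the pattern and can silently break the anchored alternatives:

```python
import re

# The ISP-bot expression from above. The anchors matter: "^google\sllc$"
# only excludes entries that are exactly "google llc".
isp_bots = re.compile(
    r"hubspot|^google\sllc$|^google\sinc\.$|alibaba\.com\sllc|ovh\shosting\sinc\."
)

providers = ["google llc", "google fiber inc.", "hubspot inc.", "comcast cable"]
flagged = [p for p in providers if isp_bots.search(p)]
print(flagged)  # ['google llc', 'hubspot inc.']
```
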

IP filter for internal traffic

We already covered different types of internal traffic, the one from test sites (with the hostname filter), and the one from third-party tools (with the campaign source filter).

Now it’s time to look at the most common and damaging of all: the traffic generated directly by you or any member of your team while working on any task for the site.

To deal with this, the standard solution is to create a filter that excludes the public IP (not private) of all locations used to work on the site.

Examples of places/people that should be filtered

  • Office
  • Support
  • Home
  • Developers
  • Hotel
  • Coffee shop
  • Bar
  • Mall
  • Any place that is regularly used to work on your site

To find the public IP of the location you are working at, simply search for “my IP” in Google. You will see one of these versions:

| IP version | Example |
|---|---|
| Short IPv4 | 1.23.45.67 |
| Long IPv6 | 2001:0db8:85a3:0000:0000:8a2e:0370:7334 |

No matter which version you see, make a list with the IP of each place and put them together with a REGEX, the same way we did with other filters.

  • IP address expression: IP1|IP2|IP3|IP4 and so on.
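Building that expression by hand is error-prone (the dots must be escaped, or "1.23.45.67" would also match "1a23b45c67"). Here's a small sketch that assembles it, using placeholder documentation IPs rather than real ones:

```python
import re

# Placeholder IPs -- replace with the public IPs of your own locations.
ips = ["203.0.113.10", "198.51.100.25", "192.0.2.44"]

# re.escape() escapes the dots so each address matches literally;
# "|" joins them into one expression, exactly as described above.
ip_pattern = "|".join(re.escape(ip) for ip in ips)
print(ip_pattern)  # 203\.0\.113\.10|198\.51\.100\.25|192\.0\.2\.44
```
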

The static IP filter configuration:

  • Filter Name: Exclude internal traffic (IP)
  • Filter Type: Custom > Exclude
  • Filter Field: IP Address
  • Filter Pattern: [The IP expression]

Cases when this filter won’t be optimal:

There are some cases in which the IP filter won’t be as efficient as it used to be:

  • You use IP anonymization (required by GDPR). When you anonymize the IP in GA, the last octet is changed to 0. This means that if your IP is 1.23.45.67, GA will process it as 1.23.45.0, so that's what you need to put in your filter. The problem is that you might also be excluding other IPs that aren't yours.
  • Your Internet provider changes your IP frequently (Dynamic IP). This has become a common issue lately, especially if you have the long version (IPv6).
  • Your team works from multiple locations. The way of working is changing — now, not all companies operate from a central office. It's often the case that some will work from home, others from the train, in a coffee shop, etc. You can still filter those places; however, maintaining the list of IPs to exclude can be a nightmare.
  • You or your team travel frequently. Similar to the previous scenario, if you or your team travels constantly, there’s no way you can keep up with the IP filters.

If one or more of these scenarios applies to you, then this filter is not optimal; I recommend trying the "Advanced internal URL query filter" below.
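The first scenario above (IP anonymization) is easy to reason about with a sketch: GA zeroes the last octet of an IPv4 address before processing it, so your filter must match the zeroed form, not the real address. The function name here is illustrative:

```python
def anonymize_ipv4(ip: str) -> str:
    """Mimic GA's IP anonymization: zero the last IPv4 octet."""
    octets = ip.split(".")
    octets[-1] = "0"
    return ".".join(octets)

# Your filter would need to exclude the anonymized form, not the real IP.
print(anonymize_ipv4("1.23.45.67"))  # 1.23.45.0
```
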

URL query filter for internal traffic

If there are dozens or hundreds of employees in the company, it’s extremely difficult to exclude them when they’re traveling, accessing the site from their personal locations, or mobile networks.

Here’s where the URL query comes to the rescue. To use this filter, you just need to add a query parameter (I use “?internal”) to any link your team uses to access your site:

  • Internal newsletters
  • Management tools (Trello, Redmine)
  • Emails to colleagues
  • Also works by directly adding it in the browser address bar

Basic internal URL query filter

The basic version of this solution is to create a filter to exclude any URL that contains the query “?internal”.

  • Filter Name: Exclude Internal Traffic (URL Query)
  • Filter Type: Custom > Exclude
  • Filter Field: Request URI
  • Filter Pattern: \?internal

This solution is perfect for instances where the user will most likely stay on the landing page, for example, when sending a newsletter to all employees to check a new post.

If the user will likely visit more than the landing page, then the subsequent pages will be recorded.
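This behavior is easy to verify: only URIs that actually contain the tagged query are excluded, so untagged follow-up pageviews slip through. A quick check in Python (the URIs are illustrative; note that "?" is escaped in the pattern because it's a regex metacharacter):

```python
import re

# The filter pattern from the configuration above.
internal_query = re.compile(r"\?internal")

uris = ["/blog/new-post?internal", "/blog/new-post", "/pricing?internal&utm_source=x"]
flagged = [u for u in uris if internal_query.search(u)]
print(flagged)  # ['/blog/new-post?internal', '/pricing?internal&utm_source=x']
```
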

Advanced internal URL query filter

This solution is the champion of all internal traffic filters!

It’s a more comprehensive version of the previous solution and works by filtering internal traffic dynamically using Google Tag Manager, a GA custom dimension, and cookies.

Although this solution is a bit more complicated to set up, once it’s in place:

  • It doesn’t need maintenance
  • Any team member can use it, no need to explain techy stuff
  • Can be used from any location
  • Can be used from any device, and any browser

To activate the filter, you just have to add the text “?internal” to any URL of the website.

That will insert a small cookie in the browser that will tell GA not to record the visits from that browser.

And the best part is that the cookie will stay there for a year (unless it is manually removed), so the user doesn’t have to add “?internal” every time.

Bonus filter: Include only internal traffic

On some occasions, it’s useful to know the traffic generated internally by employees — maybe because you want to measure the success of an internal campaign or just because you’re a curious person.

In that case, you should create an additional view, call it “Internal Traffic Only,” and use one of the internal filters above. Just one! Because if you have multiple include filters, the hit will need to match all of them to be counted.

If you configured the “Advanced internal URL query” filter, use that one. If not, choose one of the others.

The configuration is exactly the same — you only need to change “Exclude” for “Include.”

Cleaning historical data

The filters will prevent future hits from junk traffic.

But what about past affected data?

I know I told you that deleting aggregated historical data is not possible in GA. However, there’s still a way to temporarily clean up at least some of the nasty traffic that has already polluted your reports.

For this, we’ll use an advanced segment (a subset of your Analytics data). There are built-in segments like “Organic” or “Mobile,” but you can also build one using your own set of rules.

To clean our historical data, we will build a segment using all the expressions from the filters above as conditions (except the ones from the IP filter, because IPs are not stored in GA; hence, they can’t be segmented).

To help you get started, you can import this segment template.

You just need to follow the instructions on that page and replace the placeholders. Here is how it looks:

In the actual template, all text is black; the colors are just to help you visualize the conditions.

After importing it, to select the segment:

  1. Click on the box that says “All users” at the top of any of your reports
  2. From your list of segments, check the one that says “0. All Users – Clean”
  3. Lastly, uncheck the “All Users”

Now you can navigate through your reports and all the junk traffic included in the segment will be removed.

A few things to consider when using this segment:

  • Segments have to be selected each time. One way to have a segment selected by default is to bookmark the report with the segment applied.
  • You can edit the segment at any time to update it or to add and remove conditions (open the list of segments, then click “Actions,” then “Edit”).
  • The hostname expression and third-party tools expression are different for each site.
  • If your site has a large volume of traffic, segments may sample your data when selected. If the little shield icon at the top of your reports turns yellow (it’s normally green), try choosing a shorter period (e.g. 6 months or 1 month).

Conclusion: Which cake would you eat?

Having real and accurate data is essential for your Google Analytics to report as you would expect.

But if you haven’t filtered it properly, it’s almost certain that it will be filled with all sorts of junk and artificial information.

And the worst part is that if you don’t realize that your reports contain bogus data, you will likely make poor decisions when deciding on the next steps for your site or business.

The filters I shared above will help you prevent the three most harmful threats polluting your Google Analytics and keeping you from getting a clear view of the actual performance of your site: spam, bots, and internal traffic.

Once these filters are in place, you can rest assured that your efforts (and money!) won’t be wasted on analyzing deceptive Google Analytics data, and your decisions will be based on solid information.

And the benefits don’t stop there. If you’re using other tools that import data from GA, for example, WordPress plugins like GADWP, Excel add-ins like AnalyticsEdge, or SEO suites like Moz Pro, the benefits will trickle down to all of them as well.

Besides highlighting the importance of the filters in GA (which I hope I made clear by now), I would also love for the preparation of these filters to give you the curiosity and basis to create others that will allow you to do all sorts of remarkable things with your data.

Remember, filters not only allow you to keep away junk, you can also use them to rearrange your real user information — but more on that on another occasion.


That’s it! I hope these tips help you make more sense of your data and make accurate decisions.

Have any questions, feedback, experiences? Let me know in the comments, or reach me on Twitter @carlosesal.

Complementary resources:

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!


Moz Blog

Posted in Latest News | Comments Off

How Much Data Is Missing from Analytics? And Other Analytics Black Holes

Posted by Tom.Capper

If you’ve ever compared two analytics implementations on the same site, or compared your analytics with what your business is reporting in sales, you’ve probably noticed that things don’t always match up. In this post, I’ll explain why data is missing from your web analytics platforms and how large the impact could be. Some of the issues I cover are actually quite easily addressed, and have a decent impact on traffic — there’s never been an easier way to hit your quarterly targets. ;)

I’m going to focus on GA (Google Analytics), as it’s the most commonly used provider, but most on-page analytics platforms have the same issues. Platforms that rely on server logs do avoid some issues but are fairly rare, so I won’t cover them in any depth.

Side note: Our test setup (multiple trackers & customized GA)

On Distilled.net, we have a standard Google Analytics property running from an HTML tag in GTM (Google Tag Manager). In addition, for the last two years, I’ve been running three extra concurrent Google Analytics implementations, designed to measure discrepancies between different configurations.

(If you’re just interested in my findings, you can skip this section, but if you want to hear more about the methodology, continue reading. Similarly, don’t worry if you don’t understand some of the detail here — the results are easier to follow.)

Two of these extra implementations — one in Google Tag Manager and one on page — run locally hosted, renamed copies of the Google Analytics JavaScript file (e.g. www.distilled.net/static/js/au3.js, instead of www.google-analytics.com/analytics.js) to make them harder to spot for ad blockers. I also used renamed JavaScript functions (“tcap” and “buffoon,” rather than the standard “ga”) and renamed trackers (“FredTheUnblockable” and “AlbertTheImmutable”) to avoid having duplicate trackers (which can often cause issues).

This was originally inspired by 2016-era best practice on how to get your Google Analytics setup past ad blockers. I can’t find the original article now, but you can see a very similar one from 2017 here.

Lastly, we have “DianaTheIndefatigable,” which just has a renamed tracker but otherwise uses the standard code and is implemented on-page. This completes the set of all combinations of modified and unmodified GTM and on-page trackers.

Two of Distilled’s modified on-page trackers, as seen on https://www.distilled.net/

Overall, this table summarizes our setups:

| Tracker | Renamed function? | GTM or on-page? | Locally hosted JavaScript file? |
|---|---|---|---|
| Default | No | GTM HTML tag | No |
| FredTheUnblockable | Yes – “tcap” | GTM HTML tag | Yes |
| AlbertTheImmutable | Yes – “buffoon” | On page | Yes |
| DianaTheIndefatigable | No | On page | No |

I tested their functionality in various browser/ad-block environments by watching for the pageviews appearing in browser developer tools:

Reason 1: Ad Blockers

Ad blockers, primarily browser extensions, have been growing in popularity for some time now. Mostly this has been driven by users looking for better performance and UX on ad-laden sites, but in recent years an increased emphasis on privacy has also crept in, hence the possibility of analytics blocking.

Effect of ad blockers

Some ad blockers block web analytics platforms by default, others can be configured to do so. I tested Distilled’s site with Adblock Plus and uBlock Origin, two of the most popular ad-blocking desktop browser addons, but it’s worth noting that ad blockers are increasingly prevalent on smartphones, too.

Here’s how Distilled’s setups fared:

(All numbers shown are from April 2018)

| Setup | Vs. Adblock | Vs. Adblock with “EasyPrivacy” enabled | Vs. uBlock Origin |
|---|---|---|---|
| GTM | Pass | Fail | Fail |
| On page | Pass | Fail | Fail |
| GTM + renamed script & function | Pass | Fail | Fail |
| On page + renamed script & function | Pass | Fail | Fail |

Seems like those tweaked setups didn’t do much!

Lost data due to ad blockers: ~10%

Ad blocker usage can be in the 15–25% range depending on region, but many of these installs will be default setups of AdBlock Plus, which as we’ve seen above, does not block tracking. Estimates of AdBlock Plus’s market share among ad blockers vary from 50–70%, with more recent reports tending more towards the former. So, if we assume that at most 50% of installed ad blockers block analytics, that leaves your exposure at around 10%.

Reason 2: Browser “do not track”

This is another privacy motivated feature, this time of browsers themselves. You can enable it in the settings of most current browsers. It’s not compulsory for sites or platforms to obey the “do not track” request, but Firefox offers a stronger feature under the same set of options, which I decided to test as well.

Effect of “do not track”

Most browsers now offer the option to send a “Do not track” message. I tested the latest releases of Firefox & Chrome for Windows 10.

| Setup | Chrome “do not track” | Firefox “do not track” | Firefox “tracking protection” |
|---|---|---|---|
| GTM | Pass | Pass | Fail |
| On page | Pass | Pass | Fail |
| GTM + renamed script & function | Pass | Pass | Fail |
| On page + renamed script & function | Pass | Pass | Fail |

Again, it doesn’t seem that the tweaked setups are doing much work for us here.

Lost data due to “do not track”: <1%

Only Firefox Quantum’s “Tracking Protection,” introduced in February, had any effect on our trackers. Firefox has a 5% market share, but Tracking Protection is not enabled by default. The launch of this feature had no effect on the trend for Firefox traffic on Distilled.net.

Reason 3: Filters

It’s a bit of an obvious one, but filters you’ve set up in your analytics might intentionally or unintentionally reduce your reported traffic levels.

For example, a filter excluding certain niche screen resolutions that you believe to be mostly bots, or internal traffic, will obviously cause your setup to underreport slightly.

Lost data due to filters: ???

Impact is hard to estimate, as setups will obviously vary on a site-by-site basis. I do recommend having a duplicate, unfiltered “master” view in case you realize too late that you’ve lost something you didn’t intend to.

Reason 4: GTM vs. on-page vs. misplaced on-page

Google Tag Manager has become an increasingly popular way of implementing analytics in recent years, due to its increased flexibility and the ease of making changes. However, I’ve long noticed that it can tend to underreport vs. on-page setups.

I was also curious about what would happen if you didn’t follow Google’s guidelines in setting up on-page code.

By combining my numbers with numbers from my colleague Dom Woodman’s site (you’re welcome for the link, Dom), which happens to use a Drupal analytics add-on as well as GTM, I was able to see the difference between Google Tag Manager and misplaced on-page code (right at the bottom of the <body> tag). I then weighted this against my own Google Tag Manager data to get an overall picture of all five setups.

Effect of GTM and misplaced on-page code

Traffic as a percentage of baseline (standard Google Tag Manager implementation):

| Browser | Google Tag Manager | Modified & Google Tag Manager | On-page code in <head> | Modified & on-page code in <head> | On-page code misplaced in <body> |
|---|---|---|---|---|---|
| Chrome | 100.00% | 98.75% | 100.77% | 99.80% | 94.75% |
| Safari | 100.00% | 99.42% | 100.55% | 102.08% | 82.69% |
| Firefox | 100.00% | 99.71% | 101.16% | 101.45% | 90.68% |
| Internet Explorer | 100.00% | 80.06% | 112.31% | 113.37% | 77.18% |

There are a few main takeaways here:

  • On-page code generally reports more traffic than GTM
  • Modified code is generally within a margin of error, apart from modified GTM code on Internet Explorer (see note below)
  • Misplaced analytics code will cost you up to a third of your traffic vs. properly implemented on-page code, depending on browser (!)
  • The customized setups, which are designed to get more traffic by evading ad blockers, are doing nothing of the sort.

It’s worth noting also that the customized implementations actually got less traffic than the standard ones. For the on-page code, this is within the margin of error, but for Google Tag Manager, there’s another reason — because I used unfiltered profiles for the comparison, there’s a lot of bot spam in the main profile, which primarily masquerades as Internet Explorer. Our main profile is by far the most spammed, and also acting as the baseline here, so the difference between on-page code and Google Tag Manager is probably somewhat larger than what I’m reporting.

I also split the data by mobile, out of curiosity:

Traffic as a percentage of baseline (standard Google Tag Manager implementation):

| Device | Google Tag Manager | Modified & Google Tag Manager | On-page code in <head> | Modified & on-page code in <head> | On-page code misplaced in <body> |
|---|---|---|---|---|---|
| Desktop | 100.00% | 98.31% | 100.97% | 100.89% | 93.47% |
| Mobile | 100.00% | 97.00% | 103.78% | 100.42% | 89.87% |
| Tablet | 100.00% | 97.68% | 104.20% | 102.43% | 88.13% |

The further takeaway here seems to be that mobile browsers, like Internet Explorer, can struggle with Google Tag Manager.

Lost data due to GTM: 1–5%

Google Tag Manager seems to cost you a varying amount depending on what make-up of browsers and devices use your site. On Distilled.net, the difference is around 1.7%; however, we have an unusually desktop-heavy and tech-savvy audience (not much Internet Explorer!). Depending on vertical, this could easily swell to the 5% range.

Lost data due to misplaced on-page code: ~10%

On Teflsearch.com, the impact of misplaced on-page code was around 7.5%, vs Google Tag Manager. Keeping in mind that Google Tag Manager itself underreports, the total loss could easily be in the 10% range.

Bonus round: Missing data from channels

I’ve focused above on areas where you might be missing data altogether. However, there are also lots of ways in which data can be misrepresented, or detail can be missing. I’ll cover these more briefly, but the main issues are dark traffic and attribution.

Dark traffic

Dark traffic is direct traffic that didn’t really come via direct — which is generally becoming more and more common. Typical causes are:

  • Untagged campaigns in email
  • Untagged campaigns in apps (especially Facebook, Twitter, etc.)
  • Misrepresented organic
  • Data sent from botched tracking implementations (which can also appear as self-referrals)

It’s also worth noting the trend towards genuinely direct traffic that would historically have been organic. For example, due to increasingly sophisticated browser autocompletes, cross-device history, and so on, people end up “typing” a URL that they’d have searched for historically.

Attribution

I’ve written about this in more detail here, but in general, a session in Google Analytics (and any other platform) is a fairly arbitrary construct — you might think it’s obvious how a group of hits should be grouped into one or more sessions, but in fact, the process relies on a number of fairly questionable assumptions. In particular, it’s worth noting that Google Analytics generally attributes direct traffic (including dark traffic) to the previous non-direct source, if one exists.

Discussion

I was quite surprised by some of my own findings when researching this post, but I’m sure I didn’t get everything. Can you think of any other ways in which data can end up missing from analytics?


GDPR: What it Means for Google Analytics & Online Marketing

Posted by Angela_Petteys

If you’ve been on the Internet at all in the past few months, you’ve probably seen plenty of notices about privacy policy updates from one service or another. As a marketer, a few of those notices have most likely come from Google.

With the General Data Protection Regulation (GDPR) set to go into effect on May 25th, 2018, many Internet services have been scrambling to get into compliance with the new standards — and Google is no exception. Given the nature of the services Google provides to marketers, GDPR made some significant changes in how they conduct business. And, in turn, some marketers may have to take steps to make sure their use of Google Analytics is allowable under the new rules. But a lot of marketers aren’t entirely sure what exactly GDPR is, what it means for their jobs, and what they need to do to follow the rules.

What is GDPR?

GDPR is a very broad reform that gives citizens who live in the European Economic Area (EEA) and Switzerland more control over how their personal data is collected and used online. GDPR introduces a lot of new rules and if you’re up for a little light reading, you can check out the full text of the regulation online. But here are a few of the most significant changes:

  • Companies and other organizations have to be more transparent and clearly state what information they’re collecting, what it will be used for, how they’re collecting it, and if that information will be shared with anyone else. They can also only collect information that is directly relevant for its intended use. If the organization collecting that information later decides to use it for a different purpose, they must get permission again from each individual.
  • GDPR also spells out how that information needs to be given to consumers. That information can no longer be hidden in long privacy policies filled with legal jargon. The information in disclosures needs to be written in plain language and “freely given, specific, informed, and unambiguous.” Individuals also have to take an action which clearly gives their consent to their information being collected. Pre-checked boxes and notices that rely on inaction as a way of giving consent will no longer be allowed. If a user does not agree to have their information collected, you cannot block them from accessing content based on that fact.
  • Consumers also have the right to see what information a company has about them, request that incorrect information be corrected, revoke permission for their data to be saved, and have their data exported so they can switch to another service. If someone decides to revoke their permission, the organization needs to not only remove that information from their systems in a timely manner, they also need to have it removed from anywhere else they’ve shared that information.
  • Organizations must also be able to give proof of the steps they’re taking to be in compliance. This can include keeping records of how people opt in to being on marketing lists and documentation regarding how customer information is being protected.
  • Once an individual’s information has been collected, GDPR sets out requirements for how that information is stored and protected. If a data breach occurs, consumers must be notified within 72 hours.
  • Failing to comply with GDPR can come with some very steep consequences. If a data breach occurs because of non-compliance, a company can be hit with fines as high as €20 million or 4% of the company’s annual global revenue, whichever amount is greater.

Do US-based businesses need to worry about GDPR?

Just because a business isn’t based in Europe doesn’t necessarily mean they’re off the hook as far as GDPR goes. If a company is based in the United States (or elsewhere outside the EEA), but conducts business in Europe, collects data about users from Europe, markets themselves in Europe, or has employees who work in Europe, GDPR applies to them, too.

Even if you’re working with a company that only conducts business in a very specific geographic area, you might occasionally get some visitors to your site from people outside of that region. For example, let’s say a pizza restaurant in Detroit publishes a blog post about the history of pizza on their site. It’s a pretty informative post and as a result, it brings in some traffic from pizza enthusiasts outside the Detroit area, including a few visitors from Spain. Would GDPR still apply in that sort of situation?

As long as it’s clear that a company’s goods or services are only available to consumers in the United States (or another country outside the EEA), GDPR does not apply. Going back to the pizza restaurant example, the other content on their site is written in English, emphasizes their Detroit location, and definitely doesn’t make any references to delivery to Spain, so those few page views from Spain wouldn’t be anything to worry about.

However, let’s say another US-based company has a site with the option to view German and French language versions of pages, lets customers pay with Euros, and uses marketing language that refers to European customers. In that situation, GDPR would apply since they are more clearly soliciting business from people in Europe.

Google Analytics & GDPR

If you use Google Analytics, Google is your data processor and since they handle data from people all over the world, they’ve had to take steps to become compliant with GDPR standards. However, you/your company are considered the data controller in this relationship and you will also need to take steps to make sure your Google Analytics account is set up to meet the new requirements.

Google has been rolling out some new features to help make this happen. In Analytics, you will now have the ability to delete the information of individual users if they request it. They’ve also introduced data retention settings which allow you to control how long individual user data is saved before being automatically deleted. Google has set this to be 26 months as the default setting, but if you are working with a US-based company that strictly conducts business in the United States, you can set it to never expire if you want to — at least until data protection laws change here, too. It’s important to note that this only applies to data about individual users and events, so aggregate data about high-level information like page views won’t be impacted by this.

To make sure you’re using Analytics in compliance with GDPR, a good place to start is by auditing all the data you collect to make sure it’s all relevant to its intended purpose and that you aren’t accidentally sending any personally identifiable information (PII) to Google Analytics. Sending PII to Google Analytics was already against its Terms of Service, but very often, it happens by accident when information is pushed through in a page URL. If it turns out you are sending PII to Analytics, you’ll need to talk to your web development team about how to fix it because using filters in Analytics to block it isn’t enough — you need to make sure it’s never sent to Google Analytics in the first place.

PII includes anything that can potentially be used to identify a specific person, either on its own or when combined with another piece of information, like an email address, a home address, a birthdate, a zip code, or an IP address. IP addresses weren’t always considered PII, but GDPR classifies them as an online identifier. Don’t worry, though — you can still get geographical insights about the visitors to your site. All you have to do is turn on IP anonymization and the last portion of an IP address will be replaced with a zero, so you can still get a general idea of where your traffic is coming from, although it will be a little less precise.

If you use Google Tag Manager, enabling IP anonymization is pretty easy. Just open your Google Analytics tag or its settings variable, choose "More Settings," and select "Fields to Set." Then, choose "anonymizeIp" in the "Field Name" box, enter "true" in the "Value" box, and save your changes.

If you don’t use GTM, talk to your web development team about editing the Google Analytics code to anonymize IP addresses.
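For reference, the setting itself is a one-liner in the tracking code. The property IDs below (`UA-XXXXX-Y`, `GA_MEASUREMENT_ID`) are placeholders for your own:

```javascript
// analytics.js: set the anonymizeIp field before sending the pageview.
ga('create', 'UA-XXXXX-Y', 'auto');
ga('set', 'anonymizeIp', true);
ga('send', 'pageview');

// gtag.js equivalent:
gtag('config', 'GA_MEASUREMENT_ID', { 'anonymize_ip': true });
```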

Pseudonymous information, like user IDs and transaction IDs, is still acceptable under GDPR, but it needs to be protected. User and transaction IDs should be alphanumeric database identifiers, not written out in plain text.

Also, if you haven’t already done so, don’t forget to take the steps Google has mentioned in some of those emails they’ve sent out. If you’re based outside the EEA and GDPR applies to you, go into your Google Analytics account settings and accept the updated terms of processing. If you’re based in the EEA, the updated terms have already been included in your data processing terms. If GDPR applies to you, you’ll also need to go into your organization settings and provide contact information for your organization.

Privacy policies, forms, & cookie notices

Now that you’ve gone through your data and checked your settings in Google Analytics, you need to update your site’s privacy policy, forms, and cookie notices. If your company has a legal department, it may be best to involve them in this process to make sure you’re fully compliant.

Under GDPR, a site’s privacy policy needs to be clearly written in plain language and answer basic questions like what information is being collected, why it’s being collected, how it’s being collected, who is collecting it, how it will be used, and if it will be shared with anyone else. If your site is likely to be visited by children, this information needs to be written simply enough for a child to be able to understand it.

Forms and cookie notices also need to provide that kind of information. Cookie consent forms with really vague, generic messages like, “We use cookies to give you a better experience and by using this site, you agree to our policy,” are not GDPR compliant.

GDPR & other types of marketing

The impact GDPR will have on marketers isn't limited to how you use Google Analytics. If you use certain types of marketing in the course of your job, you may have to make a few other changes, too.

Referral deals

If you work with a company that does “refer a friend”-type promotions where a customer has to enter information for a friend to receive a discount, GDPR is going to make a difference for you. Giving consent for data to be collected is a key part of GDPR and in these sorts of promotions, the person being referred can’t clearly consent to their information being collected. Under GDPR, it is possible to continue this practice, but it all depends on how that information is being used. If you store the information of the person being referred and use it for marketing purposes, it would be a violation of GDPR standards. However, if you don’t store that information or process it, you’re OK.

Email marketing

If you’re an email marketer and already follow best industry standards by doing things like only sending messages to those who clearly opt in to your list and making it easy for people to unsubscribe, the good news is that you’re probably in pretty good shape. As far as email marketing goes, GDPR is going to have the biggest impact on those who do things that have already been considered sketchy, like buying lists of contacts or not making it clear when someone is signing up to receive emails from you.

Even if you think you're good to go, it's still a good time to review your contacts and double-check that your European contacts have indeed opted into being on your list and that it was clear what they were signing up for. If any of your contacts don't have their country listed, or you're not sure how they opted in, you may want to either remove them from your list or move them to a separate segment so they don't get any messages from you until you can get that figured out. Even if you're confident your European contacts have opted in, there's no harm in sending out an email asking them to confirm that they would like to continue receiving messages from you.

Creating a double opt-in process isn't mandatory, but it's a good idea since it removes any doubt over whether or not a person has agreed to be on your list. While you're at it, take a look at the forms people use to sign up for your list and make sure they're in line with GDPR standards: no pre-checked boxes, and a clear statement that the person is agreeing to receive emails from you.

For example, here’s a non-GDPR compliant email signup option I recently saw on a checkout page. They tell you what they’re planning to send to you, but the fact that it’s a pre-checked box placed underneath the more prominent “Place Order” button makes it very easy for people to unintentionally sign up for emails they might not actually want.

Jimmy Choo, on the other hand, also gives you the chance to sign up for emails while making a purchase, but since the box isn’t pre-checked, it’s good to go under GDPR.

Marketing automation

As is the case with standard email marketing, marketing automation specialists will need to make sure they have clear consent from everyone who has agreed to be part of their lists. Check your European contacts to make sure you know how they've opted in. Also review the ways people can opt into your list to make sure it's clear exactly what they're signing up for, so the consent of your existing contacts would be considered valid.

If you use marketing automation to re-engage customers who have been inactive for a while, you may need to get permission to contact them again, depending on how long it has been since they last interacted with you.

Some marketing automation platforms have functionality which will be impacted by GDPR. Lead scoring, for example, is now considered a form of profiling and you will need to get permission from individuals to have their information used in that way. Reverse IP tracking also needs consent.

It’s also important to make sure your marketing automation platform and CRM system are set to sync automatically. If a person on your list unsubscribes and continues receiving emails because of a lapse between the two, you could get in trouble for not being GDPR compliant.

Gated content

A lot of companies use gated content, like free reports, whitepapers, or webinars, as a way to generate leads. The way they see it, the person’s information serves as the price of admission. But since GDPR prohibits blocking access to content if a person doesn’t consent to their information being collected, is gated content effectively useless now?

GDPR doesn’t completely eliminate the possibility of gated content, but there are now higher standards for collecting user information. Basically, if you’re going to have gated content, you need to be able to prove that the information you collect is necessary for you to provide the deliverable. For example, if you were organizing a webinar, you’d be justified in collecting email addresses since attendees need to be sent a link to join in. You’d have a harder time claiming an email address was required for something like a whitepaper since that doesn’t necessarily have to be delivered via email. And of course, as with any other form on a site, forms for gated content need to clearly state all the necessary information about how the information being collected will be used.

If you don’t get a lot of leads from European users anyway, you may want to just block all gated content from European visitors. Another option would be to go ahead and make that information freely available to visitors from Europe.

Google AdWords

If you use Google AdWords to advertise to European residents, note that Google already required publishers and advertisers to get permission from end users by putting disclaimers on the landing page, but GDPR makes some changes to these requirements. Google will now require publishers to get clear consent from individuals to have their information collected. Not only does this mean giving more information about how a person's data will be used; you'll also need to keep records of consent and tell users how they can opt out later on if they want to do so. If a person doesn't give consent to having their information collected, Google will make it possible to serve them non-personalized ads.

In the end

GDPR is a significant change and trying to grasp the full scope of its changes is pretty daunting. This is far from being a comprehensive guide, so if you have any questions about how GDPR applies to a particular client you’re working with, it may be best to get in touch with their legal department or team. GDPR will impact some industries more than others, so it’s best to get some input from someone who truly understands the law and how it applies to that specific business.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!


Moz Blog

Posted in Latest News | Comments Off

Microsoft adds Reddit data to Bing search results, Power BI analytics tool

Reddit posts will appear in Bing’s search results, and its data will be piped into Power BI for marketers to track brand-related comments.

The post Microsoft adds Reddit data to Bing search results, Power BI analytics tool appeared first on Search Engine Land.



Please visit Search Engine Land for the full article.


Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing
Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing