Tag Archive | "Reporting"

These are the Google Ads reporting metrics still affected by May 2 bug

The company is still working to fully correct reporting for May 1 and 2.

Please visit Search Engine Land for the full article.

Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing

Posted in Latest News | Comments Off

Google Ads store visits, store sales reporting data partially corrected

Google says it is making progress, but there are still days for which reporting is inaccurate.

Please visit Search Engine Land for the full article.


All the ABM metrics to measure for your quarterly reporting

The traditional demand funnel doesn’t cut it for account-based marketing — so report on these KPIs instead.

Please visit Search Engine Land for the full article.


SearchCap: Reporting delays in Google Search Console, navigate in search & structure data

Below is what happened in search today, as reported on Search Engine Land and from other places across the web.

Please visit Search Engine Land for the full article.


Automating Technical Reporting for SEO

Posted by petewailes

As the web gets more complex, with JavaScript framework and library front ends on websites, progressive web apps, single-page apps, JSON-LD, and so on, we’re increasingly seeing an ever-greater surface area for things to go wrong. When all you’ve got is HTML and CSS and links, there’s only so much you can mess up. However, in today’s world of dynamically generated websites with universal JS interfaces, there’s a lot of room for errors to creep in.

The second problem we face with much of this is that it’s hard to know when something’s gone wrong, or when Google’s changed how they’re handling something. This is only compounded when you account for situations like site migrations or redesigns, where you might suddenly archive a lot of old content, or re-map a URL structure. How do we address these challenges then?

The old way

Historically, the way you’d analyze things like this is through looking at your log files using Excel or, if you’re hardcore, Log Parser. Those are great, but they require you to know you’ve got an issue, or that you’re looking and happen to grab a section of logs that have the issues you need to address in them. Not impossible, and we’ve written about doing this fairly extensively both in our blog and our log file analysis guide.

The problem with this, though, is fairly obvious. It requires that you look, rather than making you aware that there’s something to look for. With that in mind, I thought I’d spend some time investigating whether there’s something that could be done to make the whole process take less time and act as an early warning system.

A helping hand

The first thing we need to do is to set our server to send log files somewhere. My standard solution to this has become using log rotation. Depending on your server, you’ll use different methods to achieve this, but on Nginx it looks like this:

# time_iso8601 looks like this: 2016-08-10T14:53:00+01:00
if ($time_iso8601 ~ "^(\d{4})-(\d{2})-(\d{2})") {
        set $year $1;
        set $month $2;
        set $day $3;
}

access_log /var/log/nginx/$year-$month-$day-access.log;

This allows you to view logs for any specific date or set of dates by simply pulling the data from files relating to that period. Having set up log rotation, we can then set up a script, which we’ll run at midnight using Cron, to pull the log file that relates to yesterday’s data and analyze it. Should you want to, you can look several times a day, or once a week, or at whatever interval best suits your level of data volume.
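As a rough illustration of the nightly job, here is a minimal Python sketch that works out which file holds yesterday's data. It assumes the `/var/log/nginx/YYYY-MM-DD-access.log` naming produced by the configuration above; the function names and the Cron line in the comment are hypothetical.

```python
from datetime import date, timedelta
from pathlib import Path
from typing import Optional

# Directory assumed from the access_log directive; change it to match your server.
LOG_DIR = Path("/var/log/nginx")

def log_path_for(day: date, log_dir: Path = LOG_DIR) -> Path:
    """Return the access log path for one calendar day."""
    return log_dir / f"{day:%Y-%m-%d}-access.log"

def yesterdays_log(today: Optional[date] = None) -> Path:
    """Return the path of the log written on the day before `today`."""
    today = today or date.today()
    return log_path_for(today - timedelta(days=1))

# Hypothetical Cron entry to run a report script at midnight:
# 0 0 * * * /usr/bin/python3 /path/to/report.py
```

Passing an explicit date to `yesterdays_log` makes it easy to re-run the report for an earlier day.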

The next question is: What would we want to look for? Well, once we’ve got the logs for the day, this is what I get my system to report on:

30* status codes

Generate a list of all pages hit by users that resulted in a redirection. If the page linking to that resource is on your site, redirect it to the actual end point. Otherwise, get in touch with whomever is linking to you and get them to sort the link to where it should go.

404 status codes

Similar story. Any 404ing resources should be checked to make sure they’re supposed to be missing. Anything that should be there can be investigated for why it’s not resolving, and links to anything actually missing can be treated in the same way as a 301/302 code.

50* status codes

Something bad has happened and you’re not going to have a good day if you’re seeing many 50* codes. Your server is dying on requests to specific resources, or possibly your entire site, depending on exactly how bad this is.
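The three status code reports can be sketched in a few lines of Python. This assumes your logs use nginx's default "combined" format; the bucket labels and function names are made up for the example, and this is not the exact script I run.

```python
import re
from collections import Counter

# Matches nginx's default "combined" log format; adjust if your log_format differs.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def status_bucket(status):
    """Map an HTTP status code to one of the report sections, or None."""
    if 300 <= status < 400:
        return "30x"
    if status == 404:
        return "404"
    if 500 <= status < 600:
        return "50x"
    return None

def status_report(lines):
    """Count (url, referer) pairs per bucket so each hit can be traced to its link."""
    report = {"30x": Counter(), "404": Counter(), "50x": Counter()}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        bucket = status_bucket(int(m["status"]))
        if bucket:
            report[bucket][(m["url"], m["referer"])] += 1
    return report
```

Keeping the referrer next to each URL is what lets you tell an internal link you can fix yourself from an external one you need to ask someone else to update.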

Crawl budget

A list of every resource Google crawled, how many times it was requested, how many bytes were transferred, and time taken to resolve those requests. Compare this with your site map to find pages that Google won’t crawl, or that it’s hammering, and fix as needed.
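A minimal sketch of the crawl budget report follows. It assumes the default "combined" log format and identifies Googlebot by user agent string alone, which can be spoofed. Timing data is left out because it would need a custom log format that records request time. All function names are hypothetical.

```python
from collections import defaultdict

def parse_combined(line):
    """Split one 'combined' format line on double quotes; return None if malformed."""
    parts = line.split('"')
    if len(parts) < 7:
        return None
    request, status_bytes, agent = parts[1].split(), parts[2].split(), parts[5]
    if len(request) < 2 or len(status_bytes) != 2:
        return None
    size = int(status_bytes[1]) if status_bytes[1].isdigit() else 0
    return {"url": request[1], "bytes": size, "agent": agent}

def crawl_stats(lines, bot_token="Googlebot"):
    """Return {url: {"hits": n, "bytes": total}} for requests whose user agent names the bot."""
    stats = defaultdict(lambda: {"hits": 0, "bytes": 0})
    for line in lines:
        rec = parse_combined(line)
        if rec and bot_token in rec["agent"]:
            stats[rec["url"]]["hits"] += 1
            stats[rec["url"]]["bytes"] += rec["bytes"]
    return dict(stats)

def never_crawled(sitemap_paths, stats):
    """Sitemap paths that the bot did not request in this log."""
    return sorted(set(sitemap_paths) - set(stats))
```

Sorting the result of `crawl_stats` by `hits` gives the most and least requested resources. Note that `never_crawled` expects sitemap entries already reduced to paths so they are comparable with the logged URLs.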

Top/least-requested resources

Similar to the above, but detailing the most and least requested things by search engines.

Bad actors

Many bots looking for vulnerabilities will make requests to things like wp-admin, wp-login.php, 404s, config.php, and other similar common resource URLs. Any IP address that makes repeated requests to these sorts of URLs can be added automatically to an IP blacklist.
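One possible way to build that list is sketched below. It assumes you have already extracted (IP, URL) pairs from the log. The pattern and the threshold of five hits are illustrative choices, not recommendations.

```python
import re
from collections import Counter

# Paths that legitimate visitors rarely request; extend for your own stack.
SUSPICIOUS = re.compile(r"wp[-_]admin|wp[-_]login|config\.php", re.IGNORECASE)

def suspicious_ips(requests, threshold=5):
    """requests: iterable of (ip, url) pairs.

    Return the IPs with at least `threshold` requests to suspicious paths.
    """
    hits = Counter(ip for ip, url in requests if SUSPICIOUS.search(url))
    return sorted(ip for ip, n in hits.items() if n >= threshold)
```

The returned list could then be written to whatever blocklist mechanism your server uses.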

Pattern-matched URL reporting

It’s simple to use regex to match requested URLs against pre-defined patterns, to report on specific areas of your site or types of pages. For example, you could report on image requests, JavaScript files being called, pagination, form submissions (via looking for POST requests), escaped fragments, query parameters, or virtually anything else. Provided it’s in a URL or HTTP request, you can set it up as a segment to be reported on.
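To make the idea concrete, here is a small sketch of segment counting. The segment names and patterns are hypothetical and would need tailoring to your own URL structure. Treating every POST as a form submission is the same rough heuristic described above.

```python
import re
from collections import Counter

# Hypothetical segments; each maps a label to a pattern tested against the requested URL.
SEGMENTS = {
    "images": re.compile(r"\.(?:png|jpe?g|gif|webp|svg)(?:\?|$)", re.I),
    "javascript": re.compile(r"\.js(?:\?|$)", re.I),
    "pagination": re.compile(r"[?&]page=\d+|/page/\d+"),
    "escaped_fragment": re.compile(r"[?&]_escaped_fragment_="),
    "query_params": re.compile(r"\?."),
}

def segment_counts(requests):
    """requests: iterable of (method, url). A request may count toward several segments."""
    counts = Counter()
    for method, url in requests:
        if method == "POST":
            counts["form_submissions"] += 1
        for name, pattern in SEGMENTS.items():
            if pattern.search(url):
                counts[name] += 1
    return counts
```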

Spiky search crawl behavior

Log the number of requests made by Googlebot every day. If it increases by more than x%, that’s something of interest. As a side note, with most number series, a calculation to spot extreme outliers isn’t hard to create, and is probably worth your time.
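Both checks fit in a few lines. The sketch below assumes you keep a list of daily Googlebot request counts. The z-score test is just one simple way to flag extreme outliers, and the default limit of three standard deviations is an arbitrary choice.

```python
from statistics import mean, stdev

def pct_change(previous, current):
    """Percentage change from previous to current."""
    return (current - previous) / previous * 100

def is_spike(previous, current, threshold_pct):
    """True when today's count rose more than threshold_pct over yesterday's."""
    return previous > 0 and pct_change(previous, current) > threshold_pct

def is_outlier(history, current, z_limit=3.0):
    """True when current is more than z_limit standard deviations from the mean of history."""
    if len(history) < 2:
        return False
    spread = stdev(history)
    if spread == 0:
        return current != history[0]
    return abs(current - mean(history)) / spread > z_limit
```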

Outputting data

Depending on what the importance is of any particular section, you can then set the data up to be logged in a couple of ways. Firstly, large amounts of 40* and 50* status codes or bad actor requests would be worth triggering an email for. This can let you know in a hurry if something’s happening which potentially indicates a large issue. You can then get on top of whatever that may be and resolve it as a matter of priority.
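A hedged sketch of the email trigger: the thresholds, metric names, and addresses below are placeholders. The message is only built here. Sending it is left to the standard library's `smtplib` with whatever mail server you have available.

```python
from email.message import EmailMessage

# Hypothetical thresholds: daily counts above these trigger an alert.
THRESHOLDS = {"404": 100, "50x": 10, "bad_actor_requests": 50}

def breaches(daily_counts, thresholds=THRESHOLDS):
    """Return {metric: count} for every metric that exceeded its threshold."""
    return {k: v for k, v in daily_counts.items()
            if v > thresholds.get(k, float("inf"))}

def build_alert(daily_counts, sender, recipient):
    """Build an alert email, or return None when nothing needs attention."""
    over = breaches(daily_counts)
    if not over:
        return None
    msg = EmailMessage()
    msg["Subject"] = "Log alert: " + ", ".join(sorted(over))
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(f"{k}: {v}" for k, v in sorted(over.items())))
    return msg
```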

The data as a whole can also be set up to be reported on via a dashboard. If you don’t have that much data in your logs on a daily basis, you may simply want to query the files at runtime and generate the report fresh each time you view it. On the other hand, sites with a lot of traffic and thus larger log files may want to cache the output of each day to a separate file, so the data doesn’t have to be computed. Obviously the type of approach you use to do that depends a lot on the scale you’ll be operating at and how powerful your server hardware is.
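For the caching approach, something along these lines would do. The file naming and the `compute` callback are assumptions made for the example. Any function that turns a day's log into a JSON-serializable report would fit.

```python
import json
from pathlib import Path

def cached_report(day, compute, cache_dir):
    """Return the report for `day` (a 'YYYY-MM-DD' string), computing it at most once."""
    cache_file = Path(cache_dir) / f"{day}-report.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    report = compute(day)  # the expensive log-parsing step
    cache_file.write_text(json.dumps(report))
    return report
```

A dashboard can call `cached_report` on every page view and only pay the parsing cost the first time a given day is requested.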


Thanks to server logs and basic scripting, there’s no reason you should ever have a situation where something’s amiss on your site and you don’t know about it. Proactive notification of technical issues is a necessity in a world where Google crawls at an ever-faster rate, meaning that they could start pulling your rankings down thanks to site downtime or errors within a matter of hours.

Set up proper monitoring and make sure you’re not caught short!

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Moz Blog


Let Data Take the Wheel – Using API-Integrated Reporting Dashboards

Posted by IanWatson

Some say the only constant thing in this world is change — and that seems to go double for the online marketing and SEO industry. At times this can seem daunting and sometimes insurmountable, but some have found ways to embrace the ambiguity and even thrive on it. Their paths and techniques may all differ slightly, but a commonality exists among them.

That commonality is the utilization of data, mainly via API-driven custom tools and dashboards. APIs like Salesforce’s Chatter, Facebook’s Graph, and our very own Mozscape all allow for massive amounts of useful data to be integrated into your systems.

So, what do you do with all that data?

The use cases are limitless and really depend on your goals, business model, and available resources. Many in our industry, including myself, still rely heavily upon spreadsheets to manage large data sets.

However, the amount of native data and data within reach has grown drastically, and can quickly become unwieldy.

An example of a live reporting dashboard from Klipfolio.

Technology to the rescue!

Business intelligence (BI) is a necessary cog in the machine when it comes to running a successful business. The first step to incorporating BI into your business strategy is to adopt real-time reporting. Much like using Google Maps (yet another API!) on your phone to find your way to a new destination, data visualization companies like Klipfolio, Domo, and Tableau have built live reporting dashboards to help you navigate the wild world of online marketing. These interactive dashboards allow you to integrate data from several sources to better assist you in making real-time decisions.

A basic advertising dashboard.

For example, you could bring your ad campaign, social, and web analytics data into one place and track key metrics and overall performance in real-time. This would allow you to delegate extra resources towards what’s performing best, pulling resources from lagging activities in the funnel as they are occurring. Or perhaps you want to be ahead of the curve and integrate some deep learning into your analysis? Bringing in an API like Alchemy or a custom set-up from Algorithmia could help determine what the next trends are before they even happen. This is where the business world is heading; you don’t want to fall behind.

Resistance is futile.

The possibilities of real-time data analysis are numerous, and the first step towards embracing this new-age necessity is to get your first, simple dashboard set up. We’re here to help. In fact, our friends at Klipfolio were nice enough to give us step-by-step instructions on integrating our Mozscape data, Hubspot data, and social media metrics into their live reporting dashboard — even providing a live demo reporting dashboard. This type of dash allows you to easily create reports, visualize changes in your metrics, and make educated decisions based on hard data.

Create a live reporting dashboard featuring Moz, Hubspot and social data

1. First, you’ll need to create your Mozscape API key. You’ll need to be logged into your existing Moz account, or create a free community or pro Moz account. Once you’re logged in and on the API key page, press “Generate Key.”

2. This is the key you’ll use to access the API and is essentially your password. This is also the key you’ll use for step 6, when you’re integrating this data into Klipfolio.

3. Create a free 14-day Klipfolio trial. Then select “Add a Klip.”

4. The Klip Gallery contains pre-built widgets for whatever your favorite services might be. You can find Klips for Facebook, Instagram, Alexa, Adobe, Google Adwords and Analytics, and a bunch of other useful integrations. They’re constantly adding more. Plus, in Klipfolio, you can build your own widgets from scratch.

For now, let’s keep it simple. Select “Moz” in the Klip Gallery.

5. Pick the Klip you’d like to add first, then click “Add to Dashboard.”

6. Enter your API key and secret key. If you don’t have one already, you can get your API key and secret ID here.

7. Enter your company URL, followed by your competitors’ URLs.

8. Voilà — it’s that easy! Just like that, you have a live look at backlinks on your own dash.

9. From here, you can add any other Moz widgets you want by repeating steps 5–8. I chose to add in MozRank and Domain Authority Klips.

10. Now let’s add some social data streams onto our dash. I’m going to use Facebook and Twitter, but each of the main social media sites has a similar setup process.

11. Adding in other data sources like Hubspot, Searchmetrics, or Google Analytics simply requires you to be set up with those parties and to allow Klipfolio access.

12. Now that we have our Klips set up, the only thing left to do is arrange the layout to your liking.

After you have your preferred layout, you’re all set! You’ve now entered the world of business intelligence with your first real-time reporting dashboard. After the free Klipfolio trial is complete, it’s only $20/month to continue reporting like the pros. I haven’t found many free tools in this arena, but this plan is about as close as you’ll come.

Take a look at a live demo reporting dash, featuring all of the sources we just went over:



Just like that, you’ve joined the ranks of Big SEO, reporting like the big industry players. In future posts we’ll bring you more tutorials on building simple tools, utilizing data, and mashing it up with outside sources to better help you navigate the ever-changing world of online business. There’s no denying that, as SEO and marketing professionals, you’re always looking for that next great innovation to give you and your customers a competitive advantage.

From Netflix transitioning into an API-centric business to Amazon diving into the API management industry, the largest and most influential companies out there realize that utilizing large data sets via APIs is the future. Follow suit: Let big data and business intelligence be your guiding light!


Moz Blog


SearchCap: Google Mobile Usability Reporting, Flash Warnings Expand & Groupon Pages

Below is what happened in search today, as reported on Search Engine Land and from other places across the web. From Search Engine Land: Groupon Pages Part Of Company Evolution Into Local Search Site Depending on your viewpoint, Groupon’s new Pages offering is either a helpful new tool for…

Please visit Search Engine Land for the full article.


CRO Statistics: How to Avoid Reporting Bad Data

Posted by CraigBradford

Without a basic understanding of statistics, you can often present misleading results to your clients or superiors. This can lead to underwhelming results when you roll out new versions of a page which on paper look like they should perform much better. In this post I want to cover the main aspects of planning, monitoring and interpreting CRO results so that when you do roll out new versions of pages, the results are much closer to what you would expect. I’ve also got a free tool to give away at the end, which does most of this for you.


A large part of running a successful conversion optimisation campaign starts before a single visitor reaches the site. Before starting a CRO test it’s important to have:

  1. A hypothesis of what you expect to happen
  2. An estimate of how long the test should take
  3. Analytics set up correctly so that you can measure the effect of the change accurately

Assuming you have a hypothesis, let’s look at predicting how long a test should take.

How long will it take?

As a general rule, the less traffic that your site gets and/or the lower the existing conversion rate, the longer it will take to get statistically significant results. There’s a great tool by Evan Miller that I recommend using before starting any CRO project. Entering the baseline conversion rate and the minimum detectable effect (i.e. What is the minimum percentage change in conversion rate that you care about, 2%? 5%? 20%?) you can get an estimate of how much traffic you’ll need to send to each version. Working backwards from the traffic your site normally gets, you can estimate how long your test is likely to take. When you arrive on the site, you’ll see the following defaults:

Notice the setting that allows you to swap between ‘absolute’ and ‘relative’. Toggling between them will help you understand the difference, but as a general rule, people tend to speak about conversion rate increases in relative terms. For example:

Using a baseline conversion rate of 20%

  • With a 5% absolute improvement – the new conversion rate would be 25%
  • With a 5% relative improvement – the new conversion rate would be 21%
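The arithmetic behind those two bullets, written out as a tiny helper (the function name is made up for the example; rates are expressed as fractions):

```python
def apply_uplift(baseline, uplift, kind):
    """Return the new conversion rate after an absolute or relative uplift."""
    if kind == "absolute":
        return baseline + uplift          # 0.20 + 0.05
    if kind == "relative":
        return baseline * (1 + uplift)    # 0.20 * 1.05
    raise ValueError("kind must be 'absolute' or 'relative'")
```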

There’s a huge difference in the sample size needed to detect any change as well. In the absolute example above, 1,030 visits are needed to each branch. If you’re running two test versions against the original, that looks like this:

  • Original – 1,030
  • Version A – 1,030
  • Version B – 1,030

Total 3,090 visits needed.

If you change that to relative, that drastically changes: 25,255 visits are needed for each version. A total of 75,765 visits.

If your site only gets 1,000 visits per month and you have a baseline conversion rate of 20%, it’s going to take you 6 years to detect a significant relative increase in conversion rate of 5% compared to only around 3 months for an absolute change of the same size.
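The duration estimate is simply total visits needed divided by monthly traffic. As a quick sketch, assuming traffic is split evenly across the original and both test versions:

```python
def months_to_finish(visits_per_version, versions, monthly_visits):
    """Months of traffic needed when visitors are split evenly across all versions."""
    return visits_per_version * versions / monthly_visits

absolute_months = months_to_finish(1030, 3, 1000)        # about 3 months
relative_years = months_to_finish(25255, 3, 1000) / 12   # a little over 6 years
```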

This is why the question of whether or not small sites can do CRO often comes up. The answer is yes, they can, but you’ll want to aim higher than a 5% relative increase in conversions. For example, if you aim for a 35% relative increase (with a 20% baseline conversion rate), you’ll only need 530 visits to each version. In summary, go big if you’re a small site. Don’t test small changes like button changes; test completely new landing pages. Otherwise, it’s going to take you a very long time to get significantly better results.
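If you would rather estimate sample sizes in code, the sketch below uses a common normal-approximation formula for comparing two proportions. The formula, the 5% significance level, and the 80% power are assumptions on my part rather than a documented description of the calculator, although with those settings the results line up with the figures quoted in this section.

```python
from math import sqrt
from statistics import NormalDist

def sample_size_per_branch(baseline, mde, relative=False, alpha=0.05, power=0.80):
    """Approximate visits needed per version to detect the minimum detectable effect.

    baseline and mde are fractions (0.20 for 20%). With relative=True the
    effect is a fraction of the baseline rather than percentage points.
    """
    delta = baseline * mde if relative else mde
    p1, p2 = baseline, baseline + delta
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * sqrt(2 * p1 * (1 - p1))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return round(numerator / delta ** 2)
```

Because the required sample grows with the inverse square of the effect size, shrinking the detectable change from 5 percentage points to 1 multiplies the traffic needed by roughly 25.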


A critical part of understanding your test results is having appropriate tracking in place. At Distilled we use Optimizely, so that’s what I’ll cover today; fortunately, Optimizely makes testing and tracking really easy. All you need is a Google Analytics account that has a custom variable (custom dimension in Universal Analytics) slot free. For either Classic or Universal Analytics, begin by going to the Optimizely Editor, then clicking Options > Analytics Integration. Select enable and enter the custom variable slot that you want to use. That’s it. For more details, see the help section on the Optimizely website here.

With Google Analytics tracking enabled, now when you go to the appropriate custom variable slot in Google Analytics, you should see a custom variable named after the experiment name. In the example below the client was using custom variable slot 5:

This is a crucial step. While you can get by just using Optimizely goals, like setting a thank-you page as a conversion, that doesn’t give you the full picture. As well as measuring conversions, you’ll also want to measure behavioral metrics. Using analytics allows you to measure not only conversions, but other metrics like average order value, bounce rates, time on site, secondary conversions, etc.

Measuring interaction

Another thing that’s easy to measure with Optimizely is interactions on the page, things like clicking buttons. Even if you don’t have event tracking set up in Google Analytics, you can still measure changes in how people interact with the site. It’s not as simple as it looks though. If you try and track an element in the new version of a page, you’ll get an error message saying that no items are being tracked. See the example from Optimizely below:

Ignore this message. As long as you’ve highlighted the correct button before selecting track clicks, the tracking should work just fine. See the help section on Optimizely for more details.

Interpreting results

Once you have a test up and running, you should start to see results in Google Analytics as well as Optimizely. At this point, there are a few things to understand before you get too disappointed or excited.

Understanding statistical significance

If you’re using Google Analytics for conversion rates, you’ll need something to tell you whether or not your results are statistically significant – I like this tool by Kiss Metrics which looks like this:

It’s easy to look at the above and celebrate your 18% increase in conversions – however you’d be wrong. It’s easier to explain what this means with an example. Let’s imagine you have a pair of dice that we know are exactly the same. If you were to roll each die 100 times, you would expect to see each of the numbers 1-6 the same number of times on both dice (which works out at around 17 times per side). Let’s say on this occasion though we are trying to see how good each die is at rolling a 6. Look at the results below:

  • Die A – 17/100 = 0.17 conversion rate
  • Die B – 30/100 = 0.30 conversion rate

A simplistic way to think about statistical significance is that it measures how confident we can be that getting more 6s on the second die wasn’t just a fluke, and that the die really has been optimised in some way to roll 6s.

This makes sense when we think about it. Given that out of 100 rolls we expect to roll a 6 around 17 times, if the second time we rolled a 6 19/100 times, we could believe that we just got lucky. But if we rolled a 6 30/100 times (76% more), we would find it hard to believe that we just got lucky and the second die wasn’t actually a loaded die. If you were to put these numbers into a statistical significance tool (2 sided t-test), it would say that B performed better than A by 76% with 97% significance.

In statistics, statistical significance is the complement of the P value. The P value in this case is 3% and the complement therefore being 97% (100-3 = 97). This means there’s a 3% chance that we’d see results this extreme if the dice are identical.
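To see where a figure like 3% can come from, here is a minimal sketch using a pooled two-proportion z-test. That is a normal approximation I have chosen for illustration, and not necessarily the exact test a given significance tool runs, but for the dice numbers it gives a two-sided P value of roughly 3%.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns the two-sided P value."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

p = two_sided_p_value(17, 100, 30, 100)   # die A vs die B
significance = 1 - p                      # the complement of the P value
```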

When we see statistical significance in tools like Optimizely, they have just taken the complement of the P-value (100-3 = 97%) and displayed it as the chance to beat baseline. In the example above, we would see a chance to beat baseline of 97%. Notice that I didn’t say there’s a 97% chance of B being 76% better – it’s just that on this occasion the difference was 76% better.

This means that if we were to throw each die 100 times again, we’re 97% sure we would see noticeable differences again, which may or may not be by as much as 76%. So, with that in mind here is what we can accurately say about the dice experiment:

  • There’s a 97% chance that die B is different to die A

Here’s what we cannot say:

  • There’s a 97% chance that die B will perform 76% better than die A

This still leaves us with the question of what we can expect to happen if we roll version B out. To do this we need to use confidence intervals.

Confidence intervals

Confidence intervals help give us an estimate of how likely a change in a certain range is. To continue with the dice example, we saw an increase in conversions by 76%. Calculating confidence intervals allow us to say things like:

  • We’re 90% sure B will increase the number of 6s you roll by between 19% and 133%
  • We’re 99% sure B will increase the number of 6s you roll by between -13% and 166%

Note: These are relative ranges, measured against die A’s 17% rate: the lower bound is 13% below 17% and the upper bound is 166% above it.

The three questions you might be asking at this point are:

  1. Why is the range so large?
  2. Why is there a chance it could go negative?
  3. How likely is the difference to be on the negative side of the range?

The only way we can reduce the range of the confidence intervals is by collecting more data. To decrease the chance of the difference being less than 0 (we don’t want to roll out a version that performs worse than the original) we need to roll the dice more times. Assuming the same conversion rates for A (17%) and B (30%), look at the difference that increasing the sample size makes to the range of the confidence intervals.

As you can see, with a sample size of 100 we have a 99% confidence range of -13% to 166%. If we kept rolling the dice until we had a sample size of 10,000 the 99% confidence range looks much better, it’s now between 67% better and 85% better.
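The ranges quoted in this section can be approximated with a normal-approximation interval on the difference between the two rates, expressed relative to A's rate. This is a sketch under that assumption (it also treats A's rate as fixed when converting to relative terms), so expect small rounding differences from any particular tool.

```python
from math import sqrt
from statistics import NormalDist

def relative_ci(conv_a, n_a, conv_b, n_b, confidence):
    """Confidence interval for B's uplift over A, as a fraction of A's rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = p_b - p_a
    return (diff - z * se) / p_a, (diff + z * se) / p_a

small_sample = relative_ci(17, 100, 30, 100, 0.99)        # roughly -13% to +166%
large_sample = relative_ci(1700, 10000, 3000, 10000, 0.99)  # roughly +67% to +85%
```

The width of the interval shrinks with the square root of the sample size, which is why 100 times more rolls gives a range about 10 times narrower.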

The point of showing this is to show that even if you have a statistically significant result, it’s often wise to keep the test running until you have tighter confidence intervals. At the very least I don’t like to present results until the lower limit of the 90% interval is greater than or equal to 0.

Calculating average order value

Sometimes conversion rate on its own doesn’t matter. If you make a change that makes 10% fewer people buy, but those that do buy spend 10x more money, then the net effect is still positive.

To track this we need to be able to see the average order value of the control compared to the test value. If you’ve set up Google Analytics integration like I showed previously, this is very easy to do.

If you go into Google Analytics, select the custom variable tab, then select the e-commerce view, you’ll see something like:

  • Version A 1000 visits – 10 conversions – Average order value $50
  • Version B 1000 visits – 10 conversions – Average order value $100

It’s great that people who saw version B appear to spend twice as much, but how do we know if we just got lucky? To do that we need to do some more work. Luckily, there’s a tool that makes this very easy and again this is made by Evan Miller: Two sample t-test tool.

To find out if the change in average order value is significant, we need a list of all the transaction amounts for version A and version B. The steps to do that are below:

1 - Create an advanced segment for version A and version B using the custom variable values.

2 - Individually apply the two segments you’ve just created, go to the transactions report under e-commerce and download all transaction data to a CSV.

3 - Dump data into the two-sample t-test tool

The tool doesn’t accept special characters like $ or £ so remember to remove those before pasting into the tool. As you can see in the image below, I have version A data in the sample 1 area and the transaction values for version B in the sample 2 area. The output can be seen in the image below:

Whether or not the difference is significant is shown below the graphs. In this case the verdict was that sample 1 was in fact significantly different. To find out the difference, look at the “d” value where it says “difference of means”. In the example above the transactions of those people that saw the test version were on average $19 more than those that saw the original.
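If you want to sanity-check the numbers yourself, the sketch below cleans the exported amounts and computes the difference of means along with a Welch (unequal variance) t statistic. Welch's version is my choice for the example and may not be the variant the tool applies. Turning the statistic into a P value needs Student's t distribution, which is outside Python's standard library, so that step is left out.

```python
from math import sqrt
from statistics import mean, variance

def parse_amount(text):
    """Strip currency symbols and thousands separators: '$1,250.00' -> 1250.0."""
    return float(text.replace("$", "").replace("£", "").replace(",", "").strip())

def welch_t(sample_1, sample_2):
    """Return (difference of means, t statistic, degrees of freedom)."""
    n1, n2 = len(sample_1), len(sample_2)
    v1, v2 = variance(sample_1) / n1, variance(sample_2) / n2
    d = mean(sample_2) - mean(sample_1)
    t = d / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return d, t, df
```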

A free tool for reading this far

If you run a lot of CRO tests you’ll find yourself using the above tools a lot. While they are all great tools, I like to have these in one place. One of my colleagues, Tom Capper, built a spreadsheet which does all of the above very quickly. There are two sheets: conversion rate and average order value. The only data you need to enter in the conversion rate sheet is conversions and sessions, and in the AOV sheet just paste in the transaction values for both data sets. The conversion rate sheet calculates:

  1. Conversion rate
  2. Percentage change
  3. Statistical significance (one sided and two sided)
  4. 90%, 95%, and 99% confidence intervals (relative and absolute)

There’s an extra field that I’ve found really helpful (working agency side) that’s called “Chance of <=0 uplift”.

If, like the example above, you present results that have a potential negative lower range of a confidence interval:

  • We’re 90% sure B will increase the number of 6s you roll by between 19% and 133%
  • We’re 99% sure B will increase the number of 6s you roll by between -13% and 166%

The logical question a client is going to ask is: “What chance is there of the result being negative?”

That’s what this extra field calculates. It gives us the chance of rolling out the new version of a test and the difference being less than or equal to 0%. For the data above, the 99% confidence interval was -13% to +166%. The fact that the lower limit of the range is negative doesn’t look great, but using this calculation, the chance of the difference being <=0% is only 1.41%. Given the potential upside, most clients would agree that this is a chance worth taking.
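One plausible way to arrive at a figure like this is to treat the difference between the two rates as normally distributed and ask how much of that distribution sits at or below zero. This is my assumption about the calculation, not a description of the spreadsheet's internals, but for the dice data it gives roughly 1.4%.

```python
from math import sqrt
from statistics import NormalDist

def chance_of_no_uplift(conv_a, n_a, conv_b, n_b):
    """Probability that the true difference (B minus A) is less than or equal to zero."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return NormalDist(mu=p_b - p_a, sigma=se).cdf(0)

risk = chance_of_no_uplift(17, 100, 30, 100)
```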

You can download the spreadsheet here: Statistical Significance.xls

Feel free to say thanks to Tom on Twitter.

This is an internal tool so if it breaks, please don’t send Tom (or me) requests to fix/upgrade or change.

If you want to speed this process up even more, I recommend transferring this spreadsheet into Google docs and using the Google Analytics API to do it automatically. Here’s a good post on how you can do that.

I hope you’ve found this useful and if you have any questions or suggestions please leave a comment.

If you want to learn more about the numbers behind this spreadsheet and statistics in general, some blog posts I’d recommend reading are:

Why your CRO tests fail

How not to run an A/B test

Scientific method: Statistical errors


Moz Blog


Automate Your SEO Reporting by Exporting Your Leads into Excel

Posted by Brian_Harnish

For any SEO who collects email leads from web forms, the dreaded part of their existence tends to be the end of the month, when it comes to reporting conversion results to clients—verifying, re-verifying, downloading, and exporting them to generate the all-important month-end reports. It can take hours and can be very tedious, but the information gleaned from this process is well worth it. There are, however, ways to optimize your workflow to the point that it almost feels like cheating your way through the process.

By using standalone programs or macros (mini scripts within a program), a project that would normally take hours turns into minutes, and I want to take this opportunity to teach you how to do this on your own. I will use a standalone program and a macro that I found through my research to demonstrate the process so you can get a better idea of what is involved.

How to scrape leads from your Gmail (or almost any other email client)

There are a wide variety of ways to scrape leads from Gmail. You can spend the money to get a program like UBot that will help you automate the task without much effort. You can get a program like iMacros, and spend the time learning how to build proper macros that will scrape from your email box. You can spend the time to learn how to program scripts using Grease Monkey, or you can program your own stand-alone scripts. Whatever you do, you will want a solution that is as quick and easy as possible and helps to automate the task without adding much effort. I found a program on Black Hat World that is made to work on Windows, so you Mac users will need to install Windows to use it. You can download the program here.

While I am aware of the hesitation involved in downloading anything from black-hat websites, my own tests of this tool have worked out well. There are comments and reviews about this tool around the web, and it seems to work well for many users. My own research has not found an instance of this tool doing anything nefarious behind the scenes, and I would not hesitate to use it in my own email scraping.

How it works

This program works by accessing the Gmail account that is added to it and exporting the To:, From:, Subject:, and Date: fields from each email. Here is how to use it:

  1. Select the email settings you wish to use to download your emails. You can select To:, From:, Subject, and Date. The “Body” export is disabled; according to the tool’s creator it would end up scraping all of the HTML.

  2. Enter your username. This is your full email address (username@domainname.com).
  3. Enter your password.
  4. Enter the server and port number you wish to use. By default, it’s set to pop.gmail.com and port # 995.
  5. Select whether or not you wish to use a secure connection. The default Gmail settings (pop.gmail.com on port 995) expect a secure (SSL) connection, so leave the box checked for Gmail. If your email host does not actually require a secure connection, be sure to uncheck the box.
  6. Once these settings are selected, the program will save a file in the email extractor folder with a name that looks like this: 10-1-2013-1-00 AM_Username@gmail.com.
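
For readers who would rather see what such an export involves than run a downloaded executable, the settings above map onto a short script. The following is a minimal Python sketch, not the tool's actual code. It assumes POP access is enabled on the account, and the function names (headers_from_bytes, export_headers) and the semicolon-delimited output are illustrative choices. It uses the standard poplib module and asks the server for the headers only, so no message bodies are downloaded.

```python
import csv
import email
import poplib
from email import policy

# The four header fields the export writes, in column order.
FIELDS = ("To", "From", "Subject", "Date")


def headers_from_bytes(raw: bytes) -> dict:
    """Parse one raw message and return the four header fields as strings."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    return {name: str(msg.get(name, "")) for name in FIELDS}


def export_headers(username, password, out_path,
                   host="pop.gmail.com", port=995):
    """Write the headers of every message in the mailbox to out_path.

    Hypothetical helper: host and port default to the values quoted in
    the steps above. Returns the number of messages seen.
    """
    conn = poplib.POP3_SSL(host, port)  # port 995 is POP3 over SSL/TLS
    try:
        conn.user(username)
        conn.pass_(password)
        count, _size = conn.stat()
        with open(out_path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh, delimiter=";")
            writer.writerow(FIELDS)
            for number in range(1, count + 1):
                # TOP n 0 returns the headers and zero lines of the body.
                _resp, lines, _octets = conn.top(number, 0)
                row = headers_from_bytes(b"\r\n".join(lines))
                writer.writerow([row[name] for name in FIELDS])
    finally:
        conn.quit()
    return count
```

A call such as export_headers("username@domainname.com", "password", "leads.txt") would write one row per message. Some providers require an app-specific password for POP logins, so treat the credential handling here as a placeholder.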

This program is quite useful for those who either do not have or just don’t use Microsoft Outlook. If you have Outlook but are not comfortable with downloading and using this program, you can set Gmail to send your messages to Outlook, and then set up Outlook macros to export all messages to Excel (covered later in this article).

Be sure you don’t violate your host’s terms of service

This program can also work for other email hosts. Enter your applicable login details, and you should be able to scrape your emails without any trouble. However, be sure that you are actually allowed to scrape email from your host, because not all hosts permit it. Before running any heavy scraping on your email account, double-check your terms of service (ToS) so that you don’t accidentally get yourself banned from your email service. Why would an email service not allow scraping? It can cause bandwidth issues if you have hundreds of thousands of emails to export, and that kind of load may raise an eyebrow or two at your email provider. So be sure you really want to place such a large load on the email service. The author of this article is not responsible for things that may happen if you do not follow specific terms of service regulations. For your reference, here are the terms of service from several common providers:

Gmail ToS: Gmail does not have any terms that specifically prohibit scraping emails. While Gmail does state you may not access it using a method other than the interface, this is a very gray area, and the terms do not provide examples. If someone is collecting lead information for a valid reason like monthly reporting for their own use, there shouldn’t be an issue. If, however, someone is using another access method in order to take down the Gmail service, this is presumably where the Terms of Service come into play. This is why I mentioned the heavy bandwidth load that downloading thousands of emails can place on a server. Be sure you really want to proceed before doing so, and make sure you won’t be banned from your email service as a result. We are not responsible for egregious misuse of a service with the intention of interfering with it through significant bandwidth use.

MSN ToS: Does not have any terms that ban exporting emails using any of these methods. (Be sure to read your own ToS.)

Yahoo! ToS: Does not seem to have any terms that prohibit exporting emails. (Be sure to read your own ToS.)

Hostgator email limits: While the ToS doesn’t specifically seem to limit scraping or exporting of emails, there are policies and limits in place. According to Hostgator’s mail policy and limits page, “Each connecting IP is limited to 30 POP checks per hour.” Interference with Hostgator services can occur if you are running this software hundreds of times per hour, for example. Because each download uses at least one POP check, you shouldn’t have issues unless you run many downloads from your account per hour, in which case you will “likely get a password error indicating that the login is incorrect.” Such an issue corrects itself within an hour, and the email checking will automatically unlock.

Also according to their mail policy and limits page, their VPS and Dedicated plans do not have the same restrictions as their shared accounts do, so you will probably have more success with high-volume scraping on your own private servers.

A fair warning, however: I haven’t specifically tested this with Hostgator, so be sure to use caution when exporting too many times.
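
If you script your own downloads, one way to stay under a limit like the 30 POP checks per hour quoted above is to count your own checks before connecting. This is a minimal Python sketch of a sliding-window counter. The class name PopCheckBudget and its parameters are hypothetical, and it only tracks checks made by this one script, not what the host actually counts.

```python
import time
from collections import deque


class PopCheckBudget:
    """Sliding-window counter: at most `limit` checks per `window` seconds."""

    def __init__(self, limit=30, window=3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the logic can be tested
        self._stamps = deque()  # times of the checks still inside the window

    def try_acquire(self) -> bool:
        """Record a check and return True, or return False if over budget."""
        now = self.clock()
        # Forget checks that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True
```

A script would call try_acquire() before each POP login and skip or postpone the download whenever it returns False.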

Importing your scraped file into Excel

Once the program has scraped your email and saved it as a text file, the contents look garbled when you open the file directly. What we want to do now is import it into Excel so that each delimited item is displayed in its own column and we don’t have to manually copy and paste every single one. To do this, let’s open up our file in Excel by clicking on File > Import.

It will ask you: What type of file do you want to import? By default it has selected the CSV format but let’s select the text file format since our program saved this to a text file.

Now, click the file that you want to open and click on “Get Data.” The text import wizard will pop up showing you settings to choose from. Select the “Delimited” option unless it is already checked by default. Then click on Next.

In this step you can set the delimiters that your data contains. Remember when we selected the semicolon back while importing our file? Select the semicolon option here. Then, let’s click on next.

Here, we can set up our columns and set the data format. For our purposes, however, let’s just go with the default options.

Now, it will ask you where you want to put the data. You have a choice of Existing Sheet (which starts at =$A$1), New Sheet, and PivotTable. For the purposes of this article, let’s just go with the default and click on OK.

Here, you see we have perfectly aligned columns and data without much work. Now you can move forward with formatting these columns and data in whatever orientations or pivot tables you like.
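
The wizard is doing little more than splitting each line on a delimiter. If you want to check an export before opening Excel, the same split can be sketched in a few lines of Python. This is an illustration under assumptions: the helper names are made up, and the delimiter is guessed from the first line because the export may use semicolons or tabs depending on the tool's settings.

```python
import csv
import io

# Delimiters the export might plausibly use, in order of preference.
CANDIDATES = (";", "\t", ",")


def guess_delimiter(header_line: str) -> str:
    """Return the candidate delimiter that occurs most often in the first line."""
    return max(CANDIDATES, key=header_line.count)


def read_rows(text: str) -> list:
    """Split exported text into rows, each row being a list of column values."""
    first_line = text.splitlines()[0] if text else ""
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=guess_delimiter(first_line))
    # Skip completely empty lines so they do not become empty rows.
    return [row for row in reader if row]
```

For a real file, you would pass the file's contents to read_rows, for example read_rows(open(path, encoding="utf-8").read()), where path is wherever the tool saved its export.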

How to download leads from Outlook to Excel

For those who use Outlook, depending on your version, it can be cumbersome to get the data out of the program and can take longer than in just about every other program. Thankfully, Outlook features macros which can be used to export all of your data in the span of just a few seconds!

Step 1: Find or create the macro script you want to use

There are a ton of options and configurations available for this task. For our purposes, we will use modified versions of the scripts located here.

Before we get started, we will need to get the basic code from the very first code snippet, shown below. This code only exports the Subject, Received Time, and Sender of the email message. Our goal is to modify this script so that our new code will extract the entire body of the message and output it to the spreadsheet as well. Don’t worry! I am going over each line of code that we modify in this tutorial! This way, you will understand exactly what we are doing and why.

Sub ExportMessagesToExcel()
  Dim olkMsg As Object, _
     excApp As Object, _
     excWkb As Object, _
     excWks As Object, _
     intRow As Integer, _
     intVersion As Integer, _
     strFilename As String
  strFilename = InputBox("Enter a filename (including path) to save the exported messages to.", "Export Messages to Excel")
  If strFilename <> "" Then
     intVersion = GetOutlookVersion()
     Set excApp = CreateObject("Excel.Application")
     Set excWkb = excApp.Workbooks.Add()
     Set excWks = excWkb.ActiveSheet
     'Write Excel Column Headers
     With excWks
        .Cells(1, 1) = "Subject"
        .Cells(1, 2) = "Received"
        .Cells(1, 3) = "Sender"
     End With
     intRow = 2
     'Write messages to spreadsheet
     For Each olkMsg In Application.ActiveExplorer.CurrentFolder.Items
        'Only export messages, not receipts or appointment requests, etc.
        If olkMsg.Class = olMail Then
           'Add a row for each field in the message you want to export
           excWks.Cells(intRow, 1) = olkMsg.Subject
           excWks.Cells(intRow, 2) = olkMsg.ReceivedTime
           excWks.Cells(intRow, 3) = GetSMTPAddress(olkMsg, intVersion)
           intRow = intRow + 1
        End If
     Next
     Set olkMsg = Nothing
     'Save the workbook, then close it and the hidden Excel instance
     excWkb.SaveAs strFilename
     excWkb.Close
     excApp.Quit
     MsgBox "Process complete.  A total of " & intRow - 2 & " messages were exported.", vbInformation + vbOKOnly, "Export messages to Excel"
  End If
  Set excWks = Nothing
  Set excWkb = Nothing
  Set excApp = Nothing
End Sub

Private Function GetSMTPAddress(Item As Outlook.MailItem, intOutlookVersion As Integer) As String
  Dim olkSnd As Outlook.AddressEntry, olkEnt As Object
  On Error Resume Next
  Select Case intOutlookVersion
     Case Is < 14
        'Before Outlook 2010: Exchange senders need a MAPI property lookup
        If Item.SenderEmailType = "EX" Then
           GetSMTPAddress = SMTP2007(Item)
        Else
           GetSMTPAddress = Item.SenderEmailAddress
        End If
     Case Else
        Set olkSnd = Item.Sender
        If olkSnd.AddressEntryUserType = olExchangeUserAddressEntry Then
           Set olkEnt = olkSnd.GetExchangeUser
           GetSMTPAddress = olkEnt.PrimarySmtpAddress
        Else
           GetSMTPAddress = Item.SenderEmailAddress
        End If
  End Select
  On Error GoTo 0
  Set olkSnd = Nothing
  Set olkEnt = Nothing
End Function

'The two helpers below are called above but were missing from this listing.
'They are reconstructed here as an assumption; compare them with the source script.
Private Function GetOutlookVersion() As Integer
  'Major version number, e.g. 11 for Outlook 2003 or 14 for Outlook 2010
  GetOutlookVersion = CInt(Split(Application.Version, ".")(0))
End Function

Private Function SMTP2007(ByVal olkMsg As Object) As String
  'Reads the sender's SMTP address through the MAPI property accessor
  Dim olkPA As Object
  On Error Resume Next
  Set olkPA = olkMsg.PropertyAccessor
  SMTP2007 = olkPA.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x5D01001E")
  On Error GoTo 0
  Set olkPA = Nothing
End Function

In order to get started, fire up your version of Outlook. I’m using a relatively old dinosaur version (Outlook 2003), but the steps can easily be found online for all versions. Most Windows versions should allow you to use Alt+F11 to open the Visual Basic code editor, which we are going to fire up next. To open it from the menus instead, follow these steps:

Step 1: Click on Tools.
Step 2: Click on Macro.
Step 3: Click on Visual Basic Editor.

Next, we are going to copy and paste our code into the editor window. I used the revision 1 script and modified it to extract text from the body by adding the following two lines: one after line 19 (the .Cells(1, 3) = "Sender" header line) and one after line 29 (the excWks.Cells(intRow, 3) line):

.Cells(1, 4) = "Message" <-- This line tells the macro to add another column header, labeled “Message”, to the first row. The new column displays the text extracted from each email. This one was added after line 19.

excWks.Cells(intRow, 4) = olkMsg.Body <-- This line tells the macro to extract the message text from the Body of the email. This way, we have an extremely easy and fast method of verifying all of the important conversion emails that we are going to be using in our reporting. This one was added after line 29.
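
Outside Outlook, the same four-column row (Subject, Received, Sender, Message) can be sketched with Python's standard email package. This is an illustration, not a port of the macro. The function name message_row is hypothetical, the Date header stands in for Outlook's ReceivedTime, and the sketch prefers the plain-text part of a message so that HTML markup does not end up in the Message column.

```python
import email
from email import policy


def message_row(raw: bytes) -> list:
    """Return [Subject, Received, Sender, Message] for one raw message."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    # Prefer the text/plain part; fall back to HTML only if that is all there is.
    part = msg.get_body(preferencelist=("plain", "html"))
    body = part.get_content().strip() if part is not None else ""
    return [
        str(msg.get("Subject", "")),
        str(msg.get("Date", "")),   # sent date, used here in place of ReceivedTime
        str(msg.get("From", "")),
        body,
    ]
```

Each returned list can be written as one spreadsheet row, for example with csv.writer.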

Now that we have our script ready, let’s go to the Visual Basic macro editor.

In the project window underneath the project, right-click within the window, click on Insert, and then click on Module. This will bring up a VbaProject.OTM file that you can add your code into.

Once you have made your desired modifications (or copied and pasted the original script, if you prefer to use it unchanged), click on the floppy disk in the upper left-hand corner to save the file, or use Ctrl+S. Then, close the Visual Basic editor.

Next, we’re going to run our newly modified macro! First, make sure the folder that you want is selected and all the leads you want to export to an Excel spreadsheet are in that folder. Then, let’s click on Tools > Macro > Macros.

Next, you will see a Macros window pop up. We need to click on the macro we want to run, and then click on run.

True to the nature of the script, you will be prompted with a dialog box that asks you what you want to name your file. Let’s call it “ExcelExportTest”. It will save it into your My Documents folder. Fire up Excel, and open your brand new spreadsheet. It contains the final version of our example, complete with all extracted elements of that folder.


By using these methods, it is possible to greatly reduce the time that you spend on manually verifying and copying/pasting leads from your email box. It will be completely automated! Once you get the hang of using these methods, most of your time will be spent in the formatting phase that comes next. So, it will be necessary to spend this time adding some proper formatting that will help make your reports beautiful and impactful.

Moz Blog

SEO Reporting To Shift Your Bottom Line

PPC & CRO are heavily report-driven and have excellent data visibility, and as a result are generally better understood to be a tool rather than an end in itself. SEO, however, has often fallen into the reporting-for-reporting’s-sake trap and there have been some surprising stand-out reports…

Please visit Search Engine Land for the full article.

Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing
