Tag Archive | "Googlebot"

Google spotted testing version of GoogleBot that can render more content

Google may be able to fully crawl more advanced and modern web apps sooner than you thought.



Please visit Search Engine Land for the full article.


Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing

Posted in Latest NewsComments Off

Does Googlebot Support HTTP/2? Challenging Google’s Indexing Claims – An Experiment

Posted by goralewicz

I was recently challenged with a question from a client, Robert, who runs a small PR firm and needed to optimize a client’s website. His question inspired me to run a small experiment in HTTP protocols. So what was Robert’s question? He asked…

Can Googlebot crawl using HTTP/2 protocols?

You may be asking yourself, why should I care about Robert and his HTTP protocols?

As a refresher, HTTP protocols are the basic set of standards allowing the World Wide Web to exchange information. They are the reason a web browser can display data stored on another server. The first was initiated back in 1989, which means, just like everything else, HTTP protocols are getting outdated. HTTP/2 is one of the latest versions of HTTP protocol to be created to replace these aging versions.

So, back to our question: why do you, as an SEO, care to know more about HTTP protocols? The short answer is that none of your SEO efforts matter or can even be done without a basic understanding of HTTP protocol. Robert knew that if his site wasn’t indexing correctly, his client would miss out on valuable web traffic from searches.

The hype around HTTP/2

HTTP/1.1 is a 17-year-old protocol (HTTP 1.0 is 21 years old). Both HTTP 1.0 and 1.1 have limitations, mostly related to performance. When HTTP/1.1 was getting too slow and out of date, Google introduced SPDY in 2009, which was the basis for HTTP/2. Side note: Starting from Chrome 53, Google decided to stop supporting SPDY in favor of HTTP/2.

HTTP/2 was a long-awaited protocol. Its main goal is to improve a website’s performance. It’s currently used by 17% of websites (as of September 2017). Adoption rate is growing rapidly, as only 10% of websites were using HTTP/2 in January 2017. You can see the adoption rate charts here. HTTP/2 is getting more and more popular, and is widely supported by modern browsers (like Chrome or Firefox) and web servers (including Apache, Nginx, and IIS).

Its key advantages are:

  • Multiplexing: The ability to send multiple requests through a single TCP connection.
  • Server push: When a client requires some resource (let’s say, an HTML document), a server can push CSS and JS files to a client cache. It reduces network latency and round-trips.
  • One connection per origin: With HTTP/2, only one connection is needed to load the website.
  • Stream prioritization: Requests (streams) are assigned a priority from 1 to 256 to deliver higher-priority resources faster.
  • Binary framing layer: HTTP/2 is easier to parse (for both the server and user).
  • Header compression: This feature reduces overhead from plain text in HTTP/1.1 and improves performance.

For more information, I highly recommend reading “Introduction to HTTP/2” by Surma and Ilya Grigorik.

All these benefits suggest pushing for HTTP/2 support as soon as possible. However, my experience with technical SEO has taught me to double-check and experiment with solutions that might affect our SEO efforts.

So the question is: Does Googlebot support HTTP/2?

Google’s promises

HTTP/2 represents a promised land, the technical SEO oasis everyone was searching for. By now, many websites have already added HTTP/2 support, and developers don’t want to optimize for HTTP/1.1 anymore. Before I could answer Robert’s question, I needed to know whether or not Googlebot supported HTTP/2-only crawling.

I was not alone in my query. This is a topic which comes up often on Twitter, Google Hangouts, and other such forums. And like Robert, I had clients pressing me for answers. The experiment needed to happen. Below I’ll lay out exactly how we arrived at our answer, but here’s the spoiler: it doesn’t. Google doesn’t crawl using the HTTP/2 protocol. If your website uses HTTP/2, you need to make sure you continue to optimize the HTTP/1.1 version for crawling purposes.

The question

It all started with a Google Hangouts in November 2015.

When asked about HTTP/2 support, John Mueller mentioned that HTTP/2-only crawling should be ready by early 2016, and he also mentioned that HTTP/2 would make it easier for Googlebot to crawl pages by bundling requests (images, JS, and CSS could be downloaded with a single bundled request).

“At the moment, Google doesn’t support HTTP/2-only crawling (…) We are working on that, I suspect it will be ready by the end of this year (2015) or early next year (2016) (…) One of the big advantages of HTTP/2 is that you can bundle requests, so if you are looking at a page and it has a bunch of embedded images, CSS, JavaScript files, theoretically you can make one request for all of those files and get everything together. So that would make it a little bit easier to crawl pages while we are rendering them for example.”

Soon after, Twitter user Kai Spriestersbach also asked about HTTP/2 support:

His clients started dropping HTTP/1.1 connections optimization, just like most developers deploying HTTP/2, which was at the time supported by all major browsers.

After a few quiet months, Google Webmasters reignited the conversation, tweeting that Google won’t hold you back if you’re setting up for HTTP/2. At this time, however, we still had no definitive word on HTTP/2-only crawling. Just because it won’t hold you back doesn’t mean it can handle it — which is why I decided to test the hypothesis.

The experiment

For months as I was following this online debate, I still received questions from our clients who no longer wanted want to spend money on HTTP/1.1 optimization. Thus, I decided to create a very simple (and bold) experiment.

I decided to disable HTTP/1.1 on my own website (https://goralewicz.com) and make it HTTP/2 only. I disabled HTTP/1.1 from March 7th until March 13th.

If you’re going to get bad news, at the very least it should come quickly. I didn’t have to wait long to see if my experiment “took.” Very shortly after disabling HTTP/1.1, I couldn’t fetch and render my website in Google Search Console; I was getting an error every time.

My website is fairly small, but I could clearly see that the crawling stats decreased after disabling HTTP/1.1. Google was no longer visiting my site.

While I could have kept going, I stopped the experiment after my website was partially de-indexed due to “Access Denied” errors.

The results

I didn’t need any more information; the proof was right there. Googlebot wasn’t supporting HTTP/2-only crawling. Should you choose to duplicate this at home with our own site, you’ll be happy to know that my site recovered very quickly.

I finally had Robert’s answer, but felt others may benefit from it as well. A few weeks after finishing my experiment, I decided to ask John about HTTP/2 crawling on Twitter and see what he had to say.

(I love that he responds.)

Knowing the results of my experiment, I have to agree with John: disabling HTTP/1 was a bad idea. However, I was seeing other developers discontinuing optimization for HTTP/1, which is why I wanted to test HTTP/2 on its own.

For those looking to run their own experiment, there are two ways of negotiating a HTTP/2 connection:

1. Over HTTP (unsecure) – Make an HTTP/1.1 request that includes an Upgrade header. This seems to be the method to which John Mueller was referring. However, it doesn’t apply to my website (because it’s served via HTTPS). What is more, this is an old-fashioned way of negotiating, not supported by modern browsers. Below is a screenshot from Caniuse.com:

2. Over HTTPS (secure) – Connection is negotiated via the ALPN protocol (HTTP/1.1 is not involved in this process). This method is preferred and widely supported by modern browsers and servers.

A recent announcement: The saga continues

Googlebot doesn’t make HTTP/2 requests

Fortunately, Ilya Grigorik, a web performance engineer at Google, let everyone peek behind the curtains at how Googlebot is crawling websites and the technology behind it:

If that wasn’t enough, Googlebot doesn’t support the WebSocket protocol. That means your server can’t send resources to Googlebot before they are requested. Supporting it wouldn’t reduce network latency and round-trips; it would simply slow everything down. Modern browsers offer many ways of loading content, including WebRTC, WebSockets, loading local content from drive, etc. However, Googlebot supports only HTTP/FTP, with or without Transport Layer Security (TLS).

Googlebot supports SPDY

During my research and after John Mueller’s feedback, I decided to consult an HTTP/2 expert. I contacted Peter Nikolow of Mobilio, and asked him to see if there were anything we could do to find the final answer regarding Googlebot’s HTTP/2 support. Not only did he provide us with help, Peter even created an experiment for us to use. Its results are pretty straightforward: Googlebot does support the SPDY protocol and Next Protocol Navigation (NPN). And thus, it can’t support HTTP/2.

Below is Peter’s response:


I performed an experiment that shows Googlebot uses SPDY protocol. Because it supports SPDY + NPN, it cannot support HTTP/2. There are many cons to continued support of SPDY:

    1. This protocol is vulnerable
    2. Google Chrome no longer supports SPDY in favor of HTTP/2
    3. Servers have been neglecting to support SPDY. Let’s examine the NGINX example: from version 1.95, they no longer support SPDY.
    4. Apache doesn’t support SPDY out of the box. You need to install mod_spdy, which is provided by Google.

To examine Googlebot and the protocols it uses, I took advantage of s_server, a tool that can debug TLS connections. I used Google Search Console Fetch and Render to send Googlebot to my website.

Here’s a screenshot from this tool showing that Googlebot is using Next Protocol Navigation (and therefore SPDY):

I’ll briefly explain how you can perform your own test. The first thing you should know is that you can’t use scripting languages (like PHP or Python) for debugging TLS handshakes. The reason for that is simple: these languages see HTTP-level data only. Instead, you should use special tools for debugging TLS handshakes, such as s_server.

Type in the console:

sudo openssl s_server -key key.pem -cert cert.pem -accept 443 -WWW -tlsextdebug -state -msg
sudo openssl s_server -key key.pem -cert cert.pem -accept 443 -www -tlsextdebug -state -msg

Please note the slight (but significant) difference between the “-WWW” and “-www” options in these commands. You can find more about their purpose in the s_server documentation.

Next, invite Googlebot to visit your site by entering the URL in Google Search Console Fetch and Render or in the Google mobile tester.

As I wrote above, there is no logical reason why Googlebot supports SPDY. This protocol is vulnerable; no modern browser supports it. Additionally, servers (including NGINX) neglect to support it. It’s just a matter of time until Googlebot will be able to crawl using HTTP/2. Just implement HTTP 1.1 + HTTP/2 support on your own server (your users will notice due to faster loading) and wait until Google is able to send requests using HTTP/2.


Summary

In November 2015, John Mueller said he expected Googlebot to crawl websites by sending HTTP/2 requests starting in early 2016. We don’t know why, as of October 2017, that hasn’t happened yet.

What we do know is that Googlebot doesn’t support HTTP/2. It still crawls by sending HTTP/ 1.1 requests. Both this experiment and the “Rendering on Google Search” page confirm it. (If you’d like to know more about the technology behind Googlebot, then you should check out what they recently shared.)

For now, it seems we have to accept the status quo. We recommended that Robert (and you readers as well) enable HTTP/2 on your websites for better performance, but continue optimizing for HTTP/ 1.1. Your visitors will notice and thank you.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!


Moz Blog

Posted in Latest NewsComments Off

Google Shares Details About the Technology Behind Googlebot

Posted by goralewicz

Crawling and indexing has been a hot topic over the last few years. As soon as Google launched Google Panda, people rushed to their server logs and crawling stats and began fixing their index bloat. All those problems didn’t exist in the “SEO = backlinks” era from a few years ago. With this exponential growth of technical SEO, we need to get more and more technical. That being said, we still don’t know how exactly Google crawls our websites. Many SEOs still can’t tell the difference between crawling and indexing.

The biggest problem, though, is that when we want to troubleshoot indexing problems, the only tool in our arsenal is Google Search Console and the Fetch and Render tool. Once your website includes more than HTML and CSS, there’s a lot of guesswork into how your content will be indexed by Google. This approach is risky, expensive, and can fail multiple times. Even when you discover the pieces of your website that weren’t indexed properly, it’s extremely difficult to get to the bottom of the problem and find the fragments of code responsible for the indexing problems.

Fortunately, this is about to change. Recently, Ilya Grigorik from Google shared one of the most valuable insights into how crawlers work:

Interestingly, this tweet didn’t get nearly as much attention as I would expect.

So what does Ilya’s revelation in this tweet mean for SEOs?

Knowing that Chrome 41 is the technology behind the Web Rendering Service is a game-changer. Before this announcement, our only solution was to use Fetch and Render in Google Search Console to see our page rendered by the Website Rendering Service (WRS). This means we can troubleshoot technical problems that would otherwise have required experimenting and creating staging environments. Now, all you need to do is download and install Chrome 41 to see how your website loads in the browser. That’s it.

You can check the features and capabilities that Chrome 41 supports by visiting Caniuse.com or Chromestatus.com (Googlebot should support similar features). These two websites make a developer’s life much easier.

Even though we don’t know exactly which version Ilya had in mind, we can find Chrome’s version used by the WRS by looking at the server logs. It’s Chrome 41.0.2272.118.

It will be updated sometime in the future

Chrome 41 was created two years ago (in 2015), so it’s far removed from the current version of the browser. However, as Ilya Grigorik said, an update is coming:

I was lucky enough to get Ilya Grigorik to read this article before it was published, and he provided a ton of valuable feedback on this topic. He mentioned that they are hoping to have the WRS updated by 2018. Fingers crossed!

Google uses Chrome 41 for rendering. What does that mean?

We now have some interesting information about how Google renders websites. But what does that mean, practically, for site developers and their clients? Does this mean we can now ignore server-side rendering and deploy client-rendered, JavaScript-rich websites?

Not so fast. Here is what Ilya Grigorik had to say in response to this question:

We now know WRS’ capabilities for rendering JavaScript and how to debug them. However, remember that not all crawlers support Javascript crawling, etc. Also, as of today, JavaScript crawling is only supported by Google and Ask (Ask is most likely powered by Google). Even if you don’t care about social media or search engines other than Google, one more thing to remember is that even with Chrome 41, not all JavaScript frameworks can be indexed by Google (read more about JavaScript frameworks crawling and indexing). This lets us troubleshoot and better diagnose problems.

For example, we ran into a problem with indexing Polymer’s generated content. Ilya Grigorik provided insight on how to deal with such issues in our experiment (below). We used this feedback to make http://jsseo.expert/polymer/ indexable — it now works fine in Chrome 41 and indexes properly.

“If you look at the raised Javascript error under the hood, the test page is throwing an error due to unsupported (in M41) ES6 syntax. You can test this yourself in M41, or use the debug snippet we provided in the blog post to log the error into the DOM to see it.”

I believe this is another powerful tool for web developers willing to make their JavaScript websites indexable.

If you want to see a live example, open http://jsseo.expert/angular2-bug/ in Chrome 41 and use the Chrome Developer Tools to play with JavaScript troubleshooting (screenshot below):

Fetch and Render is the Chrome v. 41 preview

There’s another interesting thing about Chrome 41. Google Search Console’s Fetch and Render tool is simply the Chrome 41 preview. The righthand-side view (“This is how a visitor to your website would have seen the page”) is generated by the Google Search Console bot, which is… Chrome 41.0.2272.118 (see screenshot below).

Zoom in here

There’s evidence that both Googlebot and Google Search Console Bot render pages using Chrome 41. Still, we don’t exactly know what the differences between them are. One noticeable difference is that the Google Search Console bot doesn’t respect the robots.txt file. There may be more, but for the time being, we’re not able to point them out.

Chrome 41 vs Fetch as Google: A word of caution

Chrome 41 is a great tool for debugging Googlebot. However, sometimes (not often) there’s a situation in which Chrome 41 renders a page properly, but the screenshots from Google Fetch and Render suggest that Google can’t handle the page. It could be caused by CSS animations and transitions, Googlebot timeouts, or the usage of features that Googlebot doesn’t support. Let me show you an example.

Chrome 41 preview:

Image blurred for privacy

The above page has quite a lot of content and images, but it looks completely different in Google Search Console.

Google Search Console preview for the same URL:

As you can see, Google Search Console’s preview of this URL is completely different than what you saw on the previous screenshot (Chrome 41). All the content is gone and all we can see is the search bar.

From what we noticed, Google Search Console renders CSS a little bit different than Chrome 41. This doesn’t happen often, but as with most tools, we need to double check whenever possible.

This leads us to a question…

What features are supported by Googlebot and WRS?

According to the Rendering on Google Search guide:

  • Googlebot doesn’t support IndexedDB, WebSQL, and WebGL.
  • HTTP cookies and local storage, as well as session storage, are cleared between page loads.
  • All features requiring user permissions (like Notifications API, clipboard, push, device-info) are disabled.
  • Google can’t index 3D and VR content.
  • Googlebot only supports HTTP/1.1 crawling.

The last point is really interesting. Despite statements from Google over the last 2 years, Google still only crawls using HTTP/1.1.

No HTTP/2 support (still)

We’ve mostly been covering how Googlebot uses Chrome, but there’s another recent discovery to keep in mind.

There is still no support for HTTP/2 for Googlebot.

Since it’s now clear that Googlebot doesn’t support HTTP/2, this means that if your website supports HTTP/2, you can’t drop HTTP 1.1 optimization. Googlebot can crawl only using HTTP/1.1.

There were several announcements recently regarding Google’s HTTP/2 support. To read more about it, check out my HTTP/2 experiment here on the Moz Blog.

Via https://developers.google.com/search/docs/guides/r…

Googlebot’s future

Rumor has it that Chrome 59’s headless mode was created for Googlebot, or at least that it was discussed during the design process. It’s hard to say if any of this chatter is true, but if it is, it means that to some extent, Googlebot will “see” the website in the same way as regular Internet users.

This would definitely make everything simpler for developers who wouldn’t have to worry about Googlebot’s ability to crawl even the most complex websites.

Chrome 41 vs. Googlebot’s crawling efficiency

Chrome 41 is a powerful tool for debugging JavaScript crawling and indexing. However, it’s crucial not to jump on the hype train here and start launching websites that “pass the Chrome 41 test.”

Even if Googlebot can “see” our website, there are many other factors that will affect your site’s crawling efficiency. As an example, we already have proof showing that Googlebot can crawl and index JavaScript and many JavaScript frameworks. It doesn’t mean that JavaScript is great for SEO. I gathered significant evidence showing that JavaScript pages aren’t crawled even half as effectively as HTML-based pages.

In summary

Ilya Grigorik’s tweet sheds more light on how Google crawls pages and, thanks to that, we don’t have to build experiments for every feature we’re testing — we can use Chrome 41 for debugging instead. This simple step will definitely save a lot of websites from indexing problems, like when Hulu.com’s JavaScript SEO backfired.

It’s safe to assume that Chrome 41 will now be a part of every SEO’s toolset.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!


Moz Blog

Posted in Latest NewsComments Off

Optimizing AngularJS Single-Page Applications for Googlebot Crawlers

Posted by jrridley

It’s almost certain that you’ve encountered AngularJS on the web somewhere, even if you weren’t aware of it at the time. Here’s a list of just a few sites using Angular:

  • Upwork.com
  • Freelancer.com
  • Udemy.com
  • Youtube.com

Any of those look familiar? If so, it’s because AngularJS is taking over the Internet. There’s a good reason for that: Angular- and other React-style frameworks make for a better user and developer experience on a site. For background, AngularJS and ReactJS are part of a web design movement called single-page applications, or SPAs. While a traditional website loads each individual page as the user navigates the site, including calls to the server and cache, loading resources, and rendering the page, SPAs cut out much of the back-end activity by loading the entire site when a user first lands on a page. Instead of loading a new page each time you click on a link, the site dynamically updates a single HTML page as the user interacts with the site.

image001.png

Image c/o Microsoft

Why is this movement taking over the Internet? With SPAs, users are treated to a screaming fast site through which they can navigate almost instantaneously, while developers have a template that allows them to customize, test, and optimize pages seamlessly and efficiently. AngularJS and ReactJS use advanced Javascript templates to render the site, which means the HTML/CSS page speed overhead is almost nothing. All site activity runs behind the scenes, out of view of the user.

Unfortunately, anyone who’s tried performing SEO on an Angular or React site knows that the site activity is hidden from more than just site visitors: it’s also hidden from web crawlers. Crawlers like Googlebot rely heavily on HTML/CSS data to render and interpret the content on a site. When that HTML content is hidden behind website scripts, crawlers have no website content to index and serve in search results.

Of course, Google claims they can crawl Javascript (and SEOs have tested and supported this claim), but even if that is true, Googlebot still struggles to crawl sites built on a SPA framework. One of the first issues we encountered when a client first approached us with an Angular site was that nothing beyond the homepage was appearing in the SERPs. ScreamingFrog crawls uncovered the homepage and a handful of other Javascript resources, and that was it.

SF Javascript.png

Another common issue is recording Google Analytics data. Think about it: Analytics data is tracked by recording pageviews every time a user navigates to a page. How can you track site analytics when there’s no HTML response to trigger a pageview?

After working with several clients on their SPA websites, we’ve developed a process for performing SEO on those sites. By using this process, we’ve not only enabled SPA sites to be indexed by search engines, but even to rank on the first page for keywords.

5-step solution to SEO for AngularJS

  1. Make a list of all pages on the site
  2. Install Prerender
  3. “Fetch as Google”
  4. Configure Analytics
  5. Recrawl the site

1) Make a list of all pages on your site

If this sounds like a long and tedious process, that’s because it definitely can be. For some sites, this will be as easy as exporting the XML sitemap for the site. For other sites, especially those with hundreds or thousands of pages, creating a comprehensive list of all the pages on the site can take hours or days. However, I cannot emphasize enough how helpful this step has been for us. Having an index of all pages on the site gives you a guide to reference and consult as you work on getting your site indexed. It’s almost impossible to predict every issue that you’re going to encounter with an SPA, and if you don’t have an all-inclusive list of content to reference throughout your SEO optimization, it’s highly likely you’ll leave some part of the site un-indexed by search engines inadvertently.

One solution that might enable you to streamline this process is to divide content into directories instead of individual pages. For example, if you know that you have a list of storeroom pages, include your /storeroom/ directory and make a note of how many pages that includes. Or if you have an e-commerce site, make a note of how many products you have in each shopping category and compile your list that way (though if you have an e-commerce site, I hope for your own sake you have a master list of products somewhere). Regardless of what you do to make this step less time-consuming, make sure you have a full list before continuing to step 2.

2) Install Prerender

Prerender is going to be your best friend when performing SEO for SPAs. Prerender is a service that will render your website in a virtual browser, then serve the static HTML content to web crawlers. From an SEO standpoint, this is as good of a solution as you can hope for: users still get the fast, dynamic SPA experience while search engine crawlers can identify indexable content for search results.

Prerender’s pricing varies based on the size of your site and the freshness of the cache served to Google. Smaller sites (up to 250 pages) can use Prerender for free, while larger sites (or sites that update constantly) may need to pay as much as $ 200+/month. However, having an indexable version of your site that enables you to attract customers through organic search is invaluable. This is where that list you compiled in step 1 comes into play: if you can prioritize what sections of your site need to be served to search engines, or with what frequency, you may be able to save a little bit of money each month while still achieving SEO progress.

3) “Fetch as Google”

Within Google Search Console is an incredibly useful feature called “Fetch as Google.” “Fetch as Google” allows you to enter a URL from your site and fetch it as Googlebot would during a crawl. “Fetch” returns the HTTP response from the page, which includes a full download of the page source code as Googlebot sees it. “Fetch and Render” will return the HTTP response and will also provide a screenshot of the page as Googlebot saw it and as a site visitor would see it.

This has powerful applications for AngularJS sites. Even with Prerender installed, you may find that Google is still only partially displaying your website, or it may be omitting key features of your site that are helpful to users. Plugging the URL into “Fetch as Google” will let you review how your site appears to search engines and what further steps you may need to take to optimize your keyword rankings. Additionally, after requesting a “Fetch” or “Fetch and Render,” you have the option to “Request Indexing” for that page, which can be handy catalyst for getting your site to appear in search results.

4) Configure Google Analytics (or Google Tag Manager)

As I mentioned above, SPAs can have serious trouble with recording Google Analytics data since they don’t track pageviews the way a standard website does. Instead of the traditional Google Analytics tracking code, you’ll need to install Analytics through some kind of alternative method.

One method that works well is to use the Angulartics plugin. Angulartics replaces standard pageview events with virtual pageview tracking, which tracks the entire user navigation across your application. Since SPAs dynamically load HTML content, these virtual pageviews are recorded based on user interactions with the site, which ultimately tracks the same user behavior as you would through traditional Analytics. Other people have found success using Google Tag Manager “History Change” triggers or other innovative methods, which are perfectly acceptable implementations. As long as your Google Analytics tracking records user interactions instead of conventional pageviews, your Analytics configuration should suffice.

5) Recrawl the site

After working through steps 1–4, you’re going to want to crawl the site yourself to find those errors that not even Googlebot was anticipating. One issue we discovered early with a client was that after installing Prerender, our crawlers were still running into a spider trap:

As you can probably tell, there were not actually 150,000 pages on that particular site. Our crawlers just found a recursive loop that kept generating longer and longer URL strings for the site content. This is something we would not have found in Google Search Console or Analytics. SPAs are notorious for causing tedious, inexplicable issues that you’ll only uncover by crawling the site yourself. Even if you follow the steps above and take as many precautions as possible, I can still almost guarantee you will come across a unique issue that can only be diagnosed through a crawl.

If you’ve come across any of these unique issues, let me know in the comments! I’d love to hear what other issues people have encountered with SPAs.

Results

As I mentioned earlier in the article, the process outlined above has enabled us to not only get client sites indexed, but even to get those sites ranking on first page for various keywords. Here’s an example of the keyword progress we made for one client with an AngularJS site:

Also, the organic traffic growth for that client over the course of seven months:

All of this goes to show that although SEO for SPAs can be tedious, laborious, and troublesome, it is not impossible. Follow the steps above, and you can have SEO success with your single-page app website.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!


Moz Blog

Related Articles

Posted in Latest NewsComments Off

Search In Pics: GoogleBot In Snow, Google Pet Toys & John Mueller In Star Wars

In this week’s Search In Pictures, here are the latest images culled from the web, showing what people eat at the search engine companies, how they play, who they meet, where they speak, what toys they have and more. World’s Largest Android Marshmallow Mosaic: Google John Mueller Star…



Please visit Search Engine Land for the full article.


Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing

Posted in Latest NewsComments Off

SearchCap: Suing SEOs, Googlebot Render Page & Top AdWords Impressions

Below is what happened in search today, as reported on Search Engine Land and from other places across the web. From Search Engine Land: Adapting to Change, Harnessing New Trends, Eyeing the Future ­ Search Engine Land Summit You deliver results daily, but staying successful means identifying…



Please visit Search Engine Land for the full article.


Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing

Find More Articles

Posted in Latest NewsComments Off


Advert