In the first post in this series, I talked about how relatively few URLs on the web are currently clearing the double-hurdle required for a maximum CWV (Core Web Vitals) ranking boost:
Passing the threshold for all three CWV metrics
Actually having CrUX data available so Google knows you’ve passed said thresholds
For Google’s original rollout timeline in May, we would have had 9% of URLs clearing this bar. By August 2021, this had hit 14%.
This alone may have been enough for Google to delay, downplay, and dilute their own update. But there’s another critical issue that I believe may have undermined Google’s ability to introduce Page Experience as a major ranking factor: flimsy metrics.
Flimsy metrics
It’s a challenging brief to capture the frustrations of millions of disparate users’ experiences with a handful of simple metrics. Perhaps an impossible one. In any case, Google’s choices are certainly not without their quirks. My principal charge is that many frustrating website behaviors are not only left unnoticed by the three new metrics, but actively incentivized.
To be clear, I’m sure experience as measured by CWV is broadly correlated with good page experience. But the more room for maneuver there is, and the fuzzier the data is, the less weight Google can apply to page experience as a ranking factor. If I can be accused of holding Google up to an unrealistic standard here, then I’d view that as a bed of their own making.
Largest Contentful Paint (LCP)
This perhaps feels the safest of the three new metrics, being essentially a proxy for page loading speed. Specifically, though, it measures the time taken for the largest element to finish loading. That “largest element” is the bit that raises all manner of issues.
Take a look at the Moz Blog homepage, for example. Here’s a screenshot from a day close to the original, planned CWV launch:
What would you say is the largest element here? The hero images perhaps? The blog post titles, or blurbs?
For real-world data in the CrUX dataset, of course, the largest element may vary by device type. But for a standard smartphone user agent (Moz Pro uses a Moto G4 as its mobile user agent), it’s the passage at the top (“The industry’s top wizards, doctors, and other experts…”). On desktop, it’s sometimes the page titles, depending on how long the two most recent titles happen to be. Of course, that’s part of the catch here: you have to remember to take a look with the right device. But even if you do, it’s not exactly obvious.
(If you don’t believe me, you can set up a campaign for Moz.com in Moz Pro, and check for yourself in the Performance Metrics feature within the Site Crawl tool.)
There are two reasons this ends up being a particularly unhelpful comparison metric.
1. Pages have very different structures
The importance of the “largest element” varies hugely from one page to another. Sometimes, it’s an insignificant text block, like with Moz above. Sometimes it’s the actual main feature of the page. Sometimes it’s a cookie overlay, like this example from Ebuyer:
This becomes a rather unfair, apples-to-oranges comparison, and in many cases encourages focusing on arbitrary elements.
2. Easy manipulation
When the largest few elements are similar in size (as with Moz above), there’s an incentive to make the quickest one just a bit larger. This does nothing to improve the user experience, but it will improve LCP.
First Input Delay (FID)
First Input Delay is a much less intuitive metric. It records the time between the user’s first interaction (counting clicks on interactive elements, but not scrolls or zooms) and the moment the browser starts to process that interaction. So the actual time taken to finish processing is irrelevant: it’s just the delay between a user action and the start of processing.
Naturally, if the user tries to click something whilst the page is still loading, this lag will be considerable. On the other hand, if that click happens much later, it’s likely the page will be in a good position to respond quickly.
The incentive here, then, is to delay the user’s first click. Although this is counterintuitive, it can actually be a good thing, because it pushes us away from having pop-ups and other elements that block access to content. However, if we really wanted to be cynical, then we could actually optimize for this metric by making elements harder to click, or initially non-interactive. By making navigation elements a more frustrating experience, we would buy time for the page to finish loading.
On top of this, it’s worth remembering that FID cannot be measured in the lab, because it requires that human element. Instead, Moz Pro and other lab suites (including Google’s) use Total Blocking Time, which is closer to approximating what would happen if a user immediately tried to click something.
Overall, I think this metric isn’t as unfair a comparison as Largest Contentful Paint, because gaming the system here is more a case of shooting yourself in the foot. It’s still potentially an unfair comparison, in that navigational pages will have a harder time than content pages (because on a navigational, hub, or category page, users want to click quite soon). But it could be argued that navigation pages are worse search results anyway, so perhaps, giving Google an XXL serving of the benefit of the doubt, that could be deliberate.
Cumulative Layout Shift (CLS)
And lastly, there’s Cumulative Layout Shift, another metric which seems intuitively good — we all hate it when pages shift around whilst we’re trying to read or click something. The devil, though, is once again in the details, because CLS records the maximum change in a 5-second “session” window.
Setting aside the word “session”, which confusingly has nothing to do with Google’s definition of the same word in other contexts, the issue here is that some of the worst offenders for a jarring web experience won’t actually register on this metric.
Namely:
Mid-article adverts, social media embeds, and so on, are often below the fold, so have no impact whatsoever.
Annoying pop-ups and the like often arrive after a delay, so not during the 5-second window. (And, in any case, can be configured to not count towards layout shift!)
At MozCon earlier this year, I shared this example from the Guardian, which has zero impact on their (rather good) CLS score:
So in the best case, this metric is oblivious to the worst offenders of the kind of bad experience it is surely trying to capture. And in the worst case, it again could incentivize behavior that is actively bad. For example, I might delay some annoying element of my page so that it arrives outside of the initial 5-second window. This would make it even more annoying, but improve my score.
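To make that windowing concrete, here’s a rough Python sketch (my own simplification, not Chrome’s actual implementation) of a score where only the worst 5-second window counts, so the same shifts spread further apart can produce a lower number:

```python
# Rough simplification of windowed CLS: each event is (seconds, layout shift score),
# and the reported score is the largest total shift inside any single 5-second window.
def windowed_cls(shift_events, window=5.0):
    worst = 0.0
    for start, _ in shift_events:
        total = sum(score for t, score in shift_events
                    if start <= t < start + window)
        worst = max(worst, total)
    return worst

early_popup = [(0.5, 0.05), (1.2, 0.08), (2.0, 0.20)]     # pop-up shifts at 2s
delayed_popup = [(0.5, 0.05), (1.2, 0.08), (12.0, 0.20)]  # same pop-up, delayed

print(round(windowed_cls(early_popup), 2))    # 0.33 -- every shift lands in one window
print(round(windowed_cls(delayed_popup), 2))  # 0.2 -- same shifts, spread out, better score
```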
What next?
As I mentioned in part one, Google has been a bit hesitant and timid with the rollout of Core Web Vitals as a ranking factor, and issues like those I’ve covered here might be part of the reason why. In future, we should expect Google to keep tweaking these metrics, and to add new ones.
Indeed, Google themselves said last May that they planned to incorporate more signals on a yearly basis, and improvements to responsiveness metrics are being openly discussed. This ultimately means you shouldn’t try to over-optimize, or cynically manipulate the current metrics — you’re likely to suffer for it down the line.
As I mentioned in the first article in this series, if you’re curious about where you stand for your site’s CWV thresholds today, Moz has a tool for it currently in beta with the official launch coming later this year.
In the third and final part of this series, we’ll cover the impact of CWV on rankings so far, so we can see together how much attention to pay to the various “tiebreaker” equivocations.
First, let me pull up the best seat in the house for you: the local business owner or marketer who has weathered so much in the past two years. For your work of serving the public, you deserve the comfy chair by the fire, the celebratory cup of hot chocolate, while we chat about preparing to take good care of your customers in the upcoming holiday season.
Thank you for how you’ve risen every day to countless challenges, kept communities supplied, and will even make some dreams come true when people give gifts to one another as expressions of love, hope, and generosity of spirit this winter.
We can prepare your local business to be both popular and profitable in the 2021 holiday shopping season by identifying and answering six types of scenario-specific customer questions, and strategizing where to publicize the answers. Comfortable? Here we go!
1. Do you have [x]?
It’s the most basic and obvious question at the start of every transaction, but the answer has become more complex in the past two years due to the pandemic and related supply chain issues. Customer satisfaction is now tied, more than ever, to simply communicating availability via the following methods:
Best-in-class e-commerce systems should warn customers if local inventory levels are low or items are out of stock. If your solution is lacking features, it’s a signal an upgrade may be necessary to keep customers happy.
Add your products for free to Google’s Merchant Center and be sure you’re keeping a good eye on them for accuracy.
If you’re selling one of those Cabbage-Patch-Tickle-Me-Elmo-iPhone-hot items, definitely consider investing in video sales. Even if your stock is less trendy, I predict we’re on our way to a QVC-like commerce future and this is a great time to experiment with this form of sales.
Social commerce is on the rise, too, and if your customers shop on Instagram or Facebook, you should be there. Meanwhile, purely informational social posting can help you signal availability of desired merchandise.
And, of course, be sure every member of your staff is well-trained in and has access to an accurate inventory database so that walk-in, phone, chat, and text-based queries can be quickly and correctly answered.
2. How can I get [x]?
Once a customer has established that you have an item they want, the natural next question is how they will access it, and success now depends on offering multiple options. At-home local delivery by in-house or third-party drivers, curbside-pickup, and shipping make up the present norm alongside in-store pickup. In certain verticals, shoppers will also want to know if items can be bundled as a gift or gift-wrapped.
Accuracy and transparency are vital to setting expectations for these services and their attendant fees. And don’t forget to publish purchase-by dates to ensure holiday delivery! Publicize all of your fulfillment options here:
Product pages on your website
A holiday shopping guide on your website
Shipping and service pages on your website
Customer satisfaction guarantee pages on your website
Social posting can bring further attention to your convenient holiday services
Via your all-staff training, so that every team member knows how to communicate the many ways customers can access your inventory
3. Where are you and when are you there?
It’s never been more important for your website and local business listings to contain accurate contact information and current hours of operation. Your customers will likely span a spectrum from those who are choosing to continue to shop in person to those who are firmly resolved to avoid public settings. One desire unites them all, however: avoiding the inconvenience, in these difficult times, of driving to a wrong address for an in-store or curbside pickup, calling a wrong number to place an order, or trying to make contact during incorrect stated hours of operation. Now is the time to be sure that:
Your local business listings across the web feature a correct name, address, phone number, text line, email address, and holiday hours (Moz Local can help make quick work of this for your business!)
You’ve done a complete review of all pages of your website to find the above information anywhere it’s mentioned and edited it for 2021 accuracy
Your social profiles reflect this, too.
You’ve reached out to websites, blogs, industry publications, news sites, and other sources of unstructured citations if their information about your business is outdated.
4. How much do you care?
The entirety of your COVID-19 safety precautions is a source of vital information for customers. Vaccine requirements, mask mandates, sanitation, contactless services, self-imposed closures, and all other public health measures should be communicated by any business desiring to prove that company leadership cares about staff and customers alike.
In the US, our news and social media tend to focus on individuals flouting safety, but our real lives are filled with vulnerable loved ones: children with autoimmune diseases, elders at high risk, friends with asthma. The local brand you are promoting can evince its care for the whole community and let people make an informed choice about the security of shopping with you by publishing your pandemic safety measures. Here are some good options:
Create a COVID-19 policy page on your website and practice strong internal linking to it from relevant transactional and informational pages
Include a form for customers to report failures of staff or other shoppers to adhere to the published policy so that you can take steps to correct these instances
I am sincerely hoping Google will do the right thing and add “vaccination card required” to their available attributes in the GMB dashboard. In the meantime, be sure you’ve selected as many attributes as are applicable to your policy in the Health & Safety section of the dashboard so that these appear on your listing.
Your Google My Business description is a great place to summarize your public safety policy.
Your policies can also be good topics for Google Posts.
Put your policies into question form. “Do you require masks?”, “Is your staff vaccinated?”, “Do you require proof of vaccination?” and similar questions are ones you can publish and answer in the Q&A section of your GMB listing.
Use social media to further disseminate your policy.
Get in touch with local reporters to get them writing about the positive side of businesses like yours taking as many steps as possible to keep people safer.
5. How well are you listening?
Real-world service and online reputation are inextricably linked. Near Media’s Mike Blumenthal recently published an important study of how Walmart’s online review counts went up and ratings went down in conjunction with customer disappointment over inventory shortages. Unfortunately, supply chain problems can be completely beyond a local business’ control, but what is almost always achievable is communication with customers who are taking the time to complain.
The Near Media report raised a big question for me: whether Walmart was responding to negative reviews related to shortages. I looked at the store nearest me for an answer. Sure enough, “shelves” was one of the top Place Topics trending for this location, but reviews like this one had received no response, weeks or months after publication:
67% of consumers say they are more committed to shopping small than they were pre-pandemic, and 91% say they prefer SMBs because they trust these businesses to treat them fairly. Any local business you’re marketing can do a better job than Walmart of proving that you are listening to customers’ needs and concerns simply by responding to their reviews as quickly and compassionately as possible.
Note that surveys don’t indicate customers expect perfection from local businesses; they expect fairness, and fairness starts with listening well. Good communication in return can reassure a customer that you care, and can even inspire them to edit a negative review to express an improved opinion of your customer service, even when something has gone wrong. Your review corpus, complete with owner responses, provides a constant, ready answer to potential customers who want to know how well they can expect to be treated by your brand.
Due to supply chain issues, this holiday shopping season will not be an easy one, but if you are using software like Moz Local to alert you to incoming reviews across multiple platforms and are coupling this with social listening for negative brand mentions, your transparency and responsiveness will go far towards keeping customers on your side and satisfied while also safeguarding your reputation.
6. Is there affinity?
If you’re acing questions 1-5, you’ve made sure that customers know what you have, where, when, and how to access it, the care you’re taking in regards to public safety, and the responsiveness of your customer service. You have one more major opportunity to persuade people that choosing you is the right choice for them, and this lies in publicizing the work you are doing to demonstrate affinity with the culture and needs of the communities you serve.
It speaks to the remarkable resilience of the human spirit that, even in the midst of crisis, many of your customers are continuing to actively advocate for solutions to local, national, and global problems.
Every part of the world is now being impacted by climate change, for example, and I’ve watched with admiration over the past several years as Irish media has used print, radio, television, and online marketing to promote more sustainable holidays. Meanwhile, Sweden has opened the world’s first second-hand mall, Peru is answering the 75% increase in searches for sustainable clothing by leading the fight against fast fashion, and France has outlawed planned obsolescence and is fining companies that design products intended to break. The US is also part of the 71% increase in searches for green goods over the past five years, and if your local business has committed to being part of the essential change to protect the planet, what you publish about your activities can help you and your customers make the journey together.
For other customers, lived experience and allyship could be making other issues top of mind. Racial and gender equity, human or animal rights, localism, LGBTQ+ advocacy, the cure of disease, support for elders, the differently-abled, children, or the chronically ill could all be worthy causes of great importance to community members. When your leadership and staff authentically share a commitment to progress on the issues that matter most, your business has a role to play in the ongoing work and a story to tell that will have meaning to customers.
Your website, Google Posts, social profiles, local or national media, and industry publications are all excellent places to shine a light on your activism, advocacy, sponsorship, and philanthropy. The core goal of such work should be to move important causes forward by showing businesses can be part of necessary change. But an additional benefit of taking public stances can be winning new loyal customers not just for the 2021 holiday shopping season, but for life.
From my own longstanding and heartfelt affinity with local businesses everywhere, I am wishing you an excellent, inspired, and inspiring holiday season and a new year that sees your business and its community thriving!
Faced with so many SEO tasks to worry about, how do you know which ones to prioritize? In today’s episode of Whiteboard Friday, Ola takes you through the important technical SEO fixes that should be at the top of your audit list.
Click on the whiteboard image above to open a larger version in a new tab!
Video Transcription
Hi, Moz fans. I’m Ola King. I work here at Moz, and I’m excited to join you today for another edition of Whiteboard Friday. Today we’ll be talking about prioritizing SEO fixes.
By now, you are probably already familiar with the concept of technical SEO, and if you aren’t, there are a bunch of resources out there, including the Moz blog and other sites that you can check out.
But technical SEO is probably the most important part of SEO, because even if you have the best content on the web and the most backlinks, if your site is not technically sound, then you might not be able to get the best results from all of your efforts. So your technical SEO is really the foundation of everything else you do.
There’s a bunch of tools, like Sitebulb, Moz Pro, and Screaming Frog, that allow you to analyze your site and surface technical SEO issues. But once you have the list of the issues, you might not always know how to prioritize your effort. So today’s session is to help you have a better handle on that once you have a list of those issues.
Indexing
For your technical SEO, the first thing the search engines will do with your website is crawl it. So you want to make sure that your sitemap is set up correctly, and then you have to make sure that your robots.txt is set up correctly, so that the search engines can crawl the most relevant pages on your site.
Pages to index
But once they’ve crawled your site, you want to ensure that the crawl budget is spent towards indexing the relevant pages on your site. So today we’ll cover the pages that you should be indexing, and we’ll also then talk about how you can fix those pages or how to prioritize your efforts towards fixing those pages.
So the first thing: index. The pages that you should be indexing are really pages that are important to your business. So what are your KPIs, what’s going to drive leads to your site, what’s going to drive traffic, or are there strategic pages that you’re trying to get more results from? Start with that. Know what those are, and those are the things that you want to make sure are indexed.
Of course, the larger your website, the more you have to pay attention to these. Maybe when it’s a brand-new site, it’s okay to index everything initially. But as your site grows, you might want to be more careful with that. Of course, you don’t want to index all the private and sensitive pages. So think of your login page, privacy policy pages.
They should exist on your site, of course, but they don’t need to be indexed. So you want to ensure that a meta noindex is set up for those pages, or you can go to the source and set up your robots.txt so that those pages are not even crawled in the first place. So the pages that you should prioritize, in terms of indexing for your site, are the high traffic value pages.
So these are pages that either you want to get more traffic towards or are already getting lots of traffic, and they could also be pages that might not be getting lots of traffic, but they are strategic in terms of they bring quality traffic to your website or you expect them to bring quality traffic to your site. Then high links value.
You want to ensure that the pages that are positioned on your website to drive links are indexed. Or if they are already driving links, you don’t want to mess up that aspect. You can use, once again, tools like Moz Pro to analyze pages that are getting links on your site. So you can use the Link Explorer for that.
Make sure that you’re not messing up pages that are driving value to other pages as well. So even if they don’t have external links, if they are important in terms of internal links, they are important for you to index as well. High keyword value as well. If a page is ranking for a lot of your keywords, you want to ensure a page like that is indexed as well. The role in the user journey. Some pages might not have much SEO value, but they are still very useful to your user journey. So think of your help pages. They probably won’t rank for a lot of keywords, to be honest.
However, if your customer goes to the search engine and searches for a solution to something, you want to direct them to those help pages, and you want to ensure that they are finding those easily. So not a lot of SEO value in the traditional sense, but it’s still very useful towards your user journey.
Also, pages that are placed in a very prominent position on your site should be indexed as well, like your homepage. Every link on those pages should lead customers towards the intended target, not to a dead end, so no 404 errors on your homepage or other pages that are very important to your user journey. A way to prevent that is to set up a Google Analytics custom report so that any time there’s an issue on a page like that, you are alerted quickly.
So these are the pages that you should be indexing. If there’s something that you think should be here, please let me know. This is not meant to be a perfect list. But this is based on my experience so far, what I’ve found, but you might know something that I don’t. So please let me know.
Prioritizing fixes
So, in terms of the fixes, this is how you should prioritize what you fix.
Prioritization factors
So these are the prioritization factors. So once again:
Page value: pages that are ranking, that are getting links, that are getting clicks, you should prioritize fixing those quicker than other pages.
Pages that align with your priorities or that have an impact on your KPIs should probably be prioritized as well.
Ranking potential: There are some pages that might not be doing quite well, but they have high ranking potential. So if a page is like on page 2 and you know that it’s going after a keyword that is not so difficult, then you might want to prioritize fixing that page so that you can start getting results a lot quicker.
Then by issue type as well. Some issues are worth going after and fixing quicker than other issues.
The technical effort as well. Some issues are a lot easier to fix than others. If you’re not technical, this is not for you to determine. You might want to talk to your developers and have them estimate how hard or easy an issue would be to fix.
Prioritizing by issue type
If you’re technical, of course you can be the judge of that yourself. But in terms of the issues type, this is a list of the common issues and how to prioritize them:
So you have the critical crawler issues. So you could have a server error or a broken page, so the 4XX pages or errors. Make sure you fix those immediately because they are impacting your user journey and potential rankings as well.
The metadata issues as well. So things like missing descriptions, those are things you want to fix next, after the pages are already accessible.
Redirect issues as well. No one wants a redirect chain. So make sure that’s fixed after you fix your metadata issues.
Content issues are minor tweaks here and there. They are important, but they’re very much lower on your priority list. For example, if you find that you don’t have the right keyword in your URL, that might not always be very important to fix right away, because the effort might not be worth the result: changing a URL means doing an audit to make sure you’re not breaking anything else on your site. So something that might not have a lot of impact ends up leading to a lot of work that is not super valuable to you.
Conclusion
So, yeah, these are the things that I consider when I’m fixing a site’s technical issues. There are still many steps to this. We are not able to cover all of that due to the time constraint.
But things like, for example, I mentioned the technical effort, so you might want to use things like T-shirt sizing to determine which issues are small, medium, large, extra large, and so on. Depending on your project management tool, you might be able to set up a sorting function so that you are able to do this automatically.
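As a rough sketch of what that sorting logic might look like, here are a few lines of Python (the pages, issues, values, and efforts below are entirely made up for illustration):

```python
# Hypothetical issue list: "value" and "effort" scores would come from your own
# page-value analysis and T-shirt sizing, not from any tool by default.
issues = [
    {"page": "/pricing",     "issue": "4XX error",           "value": 10, "effort": 2},
    {"page": "/blog/post-1", "issue": "missing description", "value": 4,  "effort": 1},
    {"page": "/old-page",    "issue": "redirect chain",      "value": 6,  "effort": 2},
    {"page": "/help/faq",    "issue": "keyword not in URL",  "value": 2,  "effort": 5},
]

# Simple priority score: high-value, low-effort fixes float to the top
for issue in issues:
    issue["priority"] = issue["value"] / issue["effort"]

for issue in sorted(issues, key=lambda i: i["priority"], reverse=True):
    print(f"{issue['priority']:.1f}  {issue['page']}: {issue['issue']}")
```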
So once you upload your URLs into a Google Sheet, for example, you can set up a script so that, once you’ve selected the effort, the issue type, and the value to you, the sheet sorts itself automatically, which makes this work a lot quicker. But yeah, I would like to know what you think about this, and I would love to learn from you as well.
It’s common to hear SEOs discuss the “increasing dominance” of big brands in SEO, and how smaller companies just can’t break into the rankings like they used to. Google even put out a “domain diversity” algorithm update a couple of years ago to address the issue, and people like me have shown time and time again how metrics like branded search volume and domain authority — typically signs of a big, well-known company — are key predictors of search performance.
This post, though, is to share some surprising data we’ve surfaced at Moz suggesting that, actually, right now is the best time in years to be an outsider in SEO.
What is domain diversity?
I think there are two appealing ways to define domain diversity, and we’ll look at data for both.
Normally, “domain diversity” is used to refer to a greater number of unique sites appearing in a given SERP. For example, if you have three results from Pinterest and two from eBay on the first page, that is an example of very low domain diversity. I’ll talk about this as “per-SERP” domain diversity below.
The other type of domain diversity I’d like to discuss is the extent to which the biggest sites dominate SEO in general. If the same few sites are ranking top 5 for any query you can think of, that doesn’t necessarily mean low per-SERP domain diversity, but it is still a very homogenous and inaccessible search landscape. I’ll talk about this as “overall” domain diversity, but this is also the domain diversity metric shown on MozCast.
Per-SERP domain diversity
For this measure, we’re using the percentage of page one results which are the first appearance of that subdomain. So for example, if the first page of results has 10 organic listings, of which eight are unique but two are en.wikipedia.org, that’s a score of 90% (i.e. 9/10). Another way of thinking about this same metric is the average ratio of unique subdomains (9) to total subdomains (10). The dataset used throughout is the MozCast corpus.
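As a quick illustration of that calculation with made-up subdomains, the per-SERP score is just unique subdomains divided by total results:

```python
# Ten page-one results for one made-up SERP; en.wikipedia.org appears twice
serp = ["en.wikipedia.org", "site-a.example", "en.wikipedia.org", "site-b.example",
        "site-c.example", "site-d.example", "site-e.example", "site-f.example",
        "site-g.example", "site-h.example"]

per_serp_diversity = len(set(serp)) / len(serp)
print(f"{per_serp_diversity:.0%}")  # 90% -- 9 unique subdomains across 10 results
```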
Here’s how that looks for the last four years, up to August 2021:
To my eye, this is almost incredibly level. For virtually the entirety of 2019, 2020, and 2021, we’ve hovered between 90 and 92%. That’s roughly equivalent to each SERP having one duplicated subdomain. I’ve also included the same statistic if we count sitelinks, which obviously is a little lower as sitelinks always repeat the main site domain, but this is still remarkably consistent.
There’s a five-day dip between October 4 and October 8, 2019, and a longer period of fluctuation starting June 28, 2018, but neither lines up particularly closely with any major algorithm update or SERP feature change, and both were ultimately corrected.
Google’s own Site Diversity update on June 6, 2019 barely registers, with a 0.5% impact on the day:
For me though, the main story here is the incredible consistency. For this metric to have been so level for so many years, it feels very likely that this has been something Google has explicitly targeted.
Overall domain diversity
For overall diversity, rather than looking at the average ratio of unique subdomains to total subdomains per SERP, we’ve done it across every page one result in the corpus. So, for example, taking 100 results from 10 SERPs, a score of 30% would mean that there were 30 unique subdomains.
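Here’s the same idea applied across a whole (toy) corpus rather than per SERP:

```python
from itertools import chain

# Three toy SERPs' worth of page-one subdomains
serps = [
    ["en.wikipedia.org", "site-a.example", "site-b.example"],
    ["en.wikipedia.org", "site-c.example", "site-a.example"],
    ["site-d.example", "site-a.example", "en.wikipedia.org"],
]

all_results = list(chain.from_iterable(serps))
overall_diversity = len(set(all_results)) / len(all_results)
print(f"{overall_diversity:.0%}")  # 56% -- 5 unique subdomains across 9 results
```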
This chart has a bit more going on, but you can see that the current level of diversity is the highest in years (with the exception of a brief spike in 2020):
We last saw diversity this high in mid-2017, and we’ve not seen levels significantly higher since late 2016.
As with per-SERP diversity, much of the variance on this chart doesn’t line up with any known algorithm update. However, there are a couple of recent notable exceptions:
Two of the biggest one-day changes we’ve seen are associated with Google “glitches”. Some of the more gradual changes happen around known algorithm updates, but the biggest of all seemingly went totally unnoticed by the SEO industry.
Bonus: The Big 10
Another way of measuring domain diversity is by the percentage of results that are one of the 10 most common subdomains in MozCast. What features on this list varies over time, but for context, right now it looks like this:
This metric is also lower than it has been for a while, albeit not quite so extreme:
That big drop in 2018 coincides with the introduction of video carousels as non-organic results, heavily reducing YouTube’s presence as a big player. (It now sits just outside our top 10).
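If you want to compute a similar share for your own data, here’s a toy sketch of the calculation (using a “top two” rather than a top 10 to keep the example small):

```python
from collections import Counter

# Every page-one subdomain in a toy corpus
all_results = ["en.wikipedia.org", "site-a.example", "en.wikipedia.org",
               "site-b.example", "site-a.example", "site-c.example",
               "en.wikipedia.org", "site-d.example", "site-a.example",
               "site-e.example"]

counts = Counter(all_results)
biggest = [subdomain for subdomain, _ in counts.most_common(2)]
share = sum(counts[subdomain] for subdomain in biggest) / len(all_results)
print(f"{share:.0%}")  # 60% of results come from the two most common subdomains
```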
What does this mean for SEOs?
If you still feel that small brands are being crowded out in your industry, you could be right. We’ll be publishing more posts digging into narrower versions of this data — industries, specific big sites, and result types. Indeed, I’d love to hear what you’d like to see more of over on social media.
For now, though, the main takeaway is that the last few years — even with their focus on E-A-T and other factors we consider easier for bigger brands — have far from wiped out diversity in the SERPs. On the contrary, it’s rarely looked better.
No, please, do read on. This is a post about what has gone wrong with Core Web Vitals and where we stand now, but also why you still need to care. I also have some data along the way, showing how many sites are hitting the minimum level, both now and back at the original intended launch date.
At the time of writing, it’s nearly a year and a half since Google told us that they were once again going to pull their usual trick: tell us something is a ranking factor in advance, so that we improve the web. To be fair, it’s quite a noble goal all told (albeit one they have a significant stake in). It’s a well-trodden playbook at this point, too, most notably with “mobilegeddon” and HTTPS in recent years.
Both of those recent examples felt a little underwhelming when launch day arrived, but the “Page Experience Update”, as Core Web Vitals’ rollout has been named, has felt not just underwhelming, but more than a little fumbled. This post is part of a 3-part series, where we’ll cover where we stand now, how to understand it, and what to do next.
Fumbled, you say?
Google was initially vague, telling us back in May 2020 that the update would be “in 2021”. Then, in November 2020, they told us it’d be in May 2021 — an unusually long total lead time, but so far, so good.
The surprise came in April, when we were told the update was delayed to June. And then in June, when it started rolling out “very slowly”. Finally, at the start of September, after some 16 months, we were told it was done.
So, why do I care? I think the delays (and the repeated clarifications and contradictions along the way) suggest that Google’s play didn’t quite work out this time. They told us that we should improve our websites’ performance because it was going to be a ranking factor. But for whatever reason, perhaps we didn’t improve them, and their data was a mess anyhow, so Google was left to downplay their own update as a “tiebreaker”. This is confusing and disorientating for businesses and brands, and detracts from the overall message that yes, come what may, they should work on their site performance.
As John Mueller said, “we really want to make sure that search remains useful after all”. This is the underlying bluff in Google’s pre-announced updates: they can’t make changes that stop the websites people expect to see from ranking.
Y’all got any data?
Yes, of course. What do you think we do here?
You may be familiar with our lord and savior, MozCast, Moz’s Google algorithm monitoring report. MozCast is based on a corpus of 10,000 competitive keywords, and back in May I decided to look at every URL ranking top 20 for all of these keywords, on desktop or on mobile, as tracked from a random location in the suburban USA.
This was some 400,000 results, and (surprisingly, I felt) ~210,000 unique URLs.
At the time, only 29% of these URLs had any CrUX data — this is data collected from real users in Google Chrome, and the basis of Core Web Vitals as a ranking factor. It’s possible for a URL to not have CrUX data because a certain sample size is needed before Google can work with the data, and for many lower-traffic URLs, there is not enough Chrome traffic to fill out this sample size. This 29% is a depressingly low number, especially when you consider that these are, by definition, higher-traffic pages than most: they rank top 20 for competitive terms, after all.
Google has made various equivocations around generalizing/guesstimating results based on page similarity for pages that don’t have CrUX data, and I can imagine this working for large, templated sites with long tails, but less so for smaller sites. In any case, in my experience working on large, templated sites, two pages on the same template often had vastly different performance, particularly if one was more heavily trafficked, and therefore more thoroughly cached.
Anyhow, leaving that rabbit hole to one side for a moment, you might be wondering what the Core Web Vitals outlook actually was for this 29% of URLs.
Some of these stats are quite impressive, but the real issue here is that “all 3” category. Again Google has gone and contradicted itself back and forth on whether you need to pass a threshold for all three metrics to get a performance boost, or indeed whether you need to pass any threshold at all. Still, what they have told us concretely is that we should try to meet these thresholds, and what we haven’t done is hit that bar.
30.75% passed all three thresholds, of the 29% that even had data in the first place. 30.75% of 29% is roughly 9%, so only about 9% of URLs can concretely be said to be doing alright. Applying any significant ranking boost to 9% of URLs probably isn’t good news for the quality of Google’s results, especially as household-name brands are very, very likely to be rife among the 91% left out.
So this was the situation in May, which (I hypothesize) led Google to postpone the update. What about August, when they finally rolled it out?
CrUX data availability increased from 29% to 38% between May and August 2021.
The rate of URLs with CrUX data passing all three CWV thresholds increased from 30.75% to 36.3% between May and August 2021.
So, the new multiplication (36.3% of 38%) leaves us at 14% – a marked increase over the previous 9%. Partly driven by Google collecting more data, partly by websites getting their stuff together. Presumably this trend will only increase, and Google will be able to turn up the dial on Core Web Vitals as a ranking factor, right?
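For anyone who wants to sanity-check the sums behind those two figures:

```python
# CrUX coverage multiplied by the pass rate gives the share of all URLs passing
print(0.29 * 0.3075)  # ~0.089, i.e. roughly 9% of URLs in May 2021
print(0.38 * 0.363)   # ~0.138, i.e. roughly 14% of URLs in August 2021
```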
More on that in parts 2 and 3 🙂
In the meantime, if you’re curious about where you stand for your site’s CWV thresholds, Moz has a tool for it currently in beta with the official launch coming in mid-to-late October.
Competitive research is a common and necessary task in any marketing landscape. This practice is particularly crucial in digital marketing because the ecosystem rapidly changes and brands constantly battle against each other for users across multiple platforms.
In the ideal scenario, performing competitive content research illuminates where your brand’s online content falters compared to competitors. With this information, you can solder the frail links in your marketing strategy and try to usurp the competition with superior content. The results should improve your brand’s content authority, keyword rankings, and organic share of voice.
However, competitive research rarely offers cut-and-dry wins. Your best practice acumen must be strong enough to scrounge for insights among multiple sources with varying content quality. You need to understand what content matters and what’s fluff. And ultimately, you have to know why some choices are more valuable and useful than others.
All of these factors make competitive research tricky. Because if you don’t discern the best content decisions, then you’re going to step into a pitfall trap and end up in a worse spot than before the research — especially if you emulate competitors whose approach to content is wrong, inadequate, or a bad fit for your ideal users.
To prevent turning your brand into a cautionary tale, you need to carefully choose what competitors you research, locate relevant pain points, and determine how effective their marketing strategy is.
Identifying competitors: Avoid the narrow path
When we think of competitors, we often think about direct competitors — the brands that offer similar products or solutions and vie for the same users online and in brick-and-mortar stores, such as Patagonia versus Prana.
Evaluating direct competitors’ content is a great place to start competitive research, but this narrow view is only half of the digital marketing equation. You need to widen your path and analyze how your content stacks up against SERP competitors, too. This panoramic view is even more important for small businesses that compete with national chains, like a local independent bookstore versus Barnes & Noble.
Unfortunately, many companies overlook the value of analyzing SERP rankings and organic share of voice for their vertical. Sometimes, this choice is because a brand doesn’t directly compete with the websites that rank in the top positions. In other scenarios, a company won’t have the resources to tackle both segments at once and must focus on either direct competition or SERP rankings.
Regardless of the situation, excluding SERP competitor research in favor of your direct competition is an enormous mistake.
For example, let’s say you’re shopping for rock climbing pants and are indifferent to the brand you buy. Patagonia and Prana both sell climbing pants that you can purchase directly from their website and both brands rank for “rock climbing pants” on the first page. However, neither brand breaks above the fold with its rankings. Patagonia ranks in position seven and Prana is in position eight.
The top organic position is owned by a niche climbing website with a review of different climbing pants. This website has a domain authority of 50, while Prana and Patagonia have domain authorities of 73 and 85, respectively.
The user’s search intent is the same for every result on the first page: buying climbing pants. However, in this example, apparently neither Prana nor Patagonia focused on indirect SERP competitors. If they had, they’d recognize that brand-agnostic users, such as people who use generic search terms, often buy products based on reviews and recommendations.
Google recognizes this user desire, which is why best-of lists are increasingly ranking higher than product pages for this term.
Given the domain authority of both companies and their expansive resources compared to a small, niche website, if either brand used their influencers to create unbiased review-focused content for the “rock climbing pants” keyword, they’d likely clinch the number one ranking with relative ease.
Instead, these companies are relegated down the page and must use paid advertising to compete for users’ attention.
Ultimately, accurate content analysis comes from gleaning insights from both SERP and direct competitors.
For example, let’s say you operate a B2B contact center software company for small businesses and want to rank for the ambitious term, “contact center software.” You have three direct competitors with a similar domain ranking and each of them rank somewhere on the first page. The other rankings are dominated by “best software” listicles.
This split search intent creates a delicate ranking environment and fierce competition. To have any chance of ranking on the first page, you’ll need to carefully pick-and-pull the best content aspects of both the listicles and direct competitors. And that requires knowing how to identify the right competitor to review.
How to choose competitors to review
Instead of getting sucked into the trap of balancing the analysis of SERP and direct competitors, focus on competitors who are trying to achieve the same goal and that you have an honest chance of dethroning.
If you want to improve your website’s content, any competitor you research should meet the following criteria:
The brand’s services and content are relevant and target your ideal user group
The brand follows content strategy and SEO best practices or is innovating effective alternatives
The brand ranks well on SERPs for your target keywords
The content this brand has ranking is relevant to your brand’s users and business goals
Your brand’s domain rating and page authority are reasonably competitive, so changes have the potential to spur keyword growth
You have the resources to directly compete with the brand’s online authority and presence
There are always exceptions to these rules, such as brands that don’t need a robust online presence because they rely on third-party contracts and word-of-mouth to survive, like government contractors. However, for the average B2B and B2C company, choosing competitors with these guidelines in mind will keep your attention focused on worthy competition and not riff-raff.
Identifying pain points
Once you know who your competitors are, you need to know what content to analyze and how to determine why their version is superior to yours. These choices come down to knowing your brand’s pain points.
Not understanding or researching your own pain points before delving into competitive research is a huge mistake. Pain points allow you to focus your competitive analysis. Without knowing what you want to fix, you’re aiming in the dark when you research the competitor’s content. Without light to guide you, it’s extremely easy to emulate ideas you shouldn’t or try to compete with a website that’s incompatible with your goals or organic authority.
What pain points should you focus on?
Ultimately, your business goals and content KPIs should determine what pain points you focus on. Let the slipping conversions, plummeting newsletter sign-ups, or poor website performance metrics guide your path.
Let’s say you run a documentary streaming service and are struggling to get users to sign up for a trial after reading relevant blog posts or research papers. You know one of your competitors doesn’t have this churn, so you plan to read their related content and see how the experience is superior.
Before you can dive into the competitor’s service and learn why they earn trialists, you need to know why your users refuse to join.
In this scenario, your best option will be user research, such as:
User interviews
A/B tests
Surveys
Usability tests
Heatmap tracking
Net promoter score analysis
Once you determine why your brand is failing, then you can critically consider how your opponent solves the issues users have with your brand’s service.
The trick to knowing if a competitor’s pain point solution will work for your brand is understanding why it works for the competitor. There are plenty of ways to gain this knowledge, including best practice awareness, running the competition’s idea through a user research gauntlet, and comparing the options side-by-side.
These insights all rely on one common theme: the competition is following best practices and doing everything correctly. However, competitors are fallible and often don’t offer users an ideal experience or perfect content. So what happens when the competition is wrong?
What if the competition is wrong?
Even if a competitor passes your initial screening and seems like a great brand from which to discover your weaknesses, first impressions can be deceiving.
There are plenty of mischievous marketing practices businesses can participate in that you wouldn’t notice at first glance, such as black-hat link building or paying users for positive reviews. And there are many innocent mistakes that your competitors may make that will harm your website if you implement them, like lackluster accessibility standards.
The amount of due diligence you perform should correlate with the amount of risk you take on by emulating an idea or strategy.
For low-risk ideas, like rewriting a competitor’s blog post, the due diligence can be extremely simple, such as checking the post’s sources, keyword targets, and backlinks.
High-risk ideas, like overhauling your product pages or customer journey, need a more robust background check.
Here are a handful of red flags that should encourage you to avoid a competitor or at least do a deeper dive into their website:
Content automation (like scraper blogs) or similar signs of low-quality content
Link cloaking
Guest posting networks or other content sharing ecosystems
Link farms, private blog networks, or similar manipulation
Multiple domains with duplicate content
Paid user reviews or similar manipulation
Social media manipulation
Comment spam
Fraudulent cookies
Hidden text
How to spot when the competition is wrong
To prevent adopting erroneous high-risk ideas, you should always ask yourself the following four questions:
Does the brand’s content adhere to content strategy, SEO, and UX best practices?
Is the content meaningful, and how is its value communicated to users?
Why do you think the brand created this content?
If you implemented a similar (or the same) idea, how would your updated website and its content improve user experience?
These four questions act as a check-and-balance system for new ideas. They force you to consider the justifications of why a competitor made its decisions, how users may respond, and the consequences of copying those choices. Although this process isn’t necessary for every improvement you may glean from a competitor, it’s worthwhile when you’re considering significant changes that can swing KPIs toward success or failure.
Now, go avoid competitive research pitfalls
Competitive research is a necessary marketing strategy, and it’s immensely valuable if you take the time to ensure you’re evaluating a worthy competitor. While it’s easy to skimp on the background research and assume your competitors know what they’re doing based on search rankings or public opinion, they may not be the skilled marketers you presume, and you’ll end up wasting time, resources, and users on a faulty idea.
Here’s a quick reminder of what you should do to prepare yourself for competitive research and avoid implementing bad ideas:
Identify a mix of direct and SERP competitors that have relevant content, are trying to accomplish the same goal, and target the same users.
Determine your brand’s pain points and analyze how the competitors solve similar problems.
Do background research on your competitors and their content choices to ensure they follow content strategy, SEO, and UX best practices.
Want to use Python, but don’t know where to begin? Britney and Pumpkin are here in their second episode as co-hosts with more great tips on how to get started!
Click on the whiteboard image above to open a larger version in a new tab!
Video Transcription
Hi, Moz fans. Welcome to another edition of Whiteboard Friday. I’m your host Britney Muller. I was previously Moz’s Senior SEO Scientist, and now I am freelance consulting and building some data science programs on the side.
This is my very special co-host, Pumpkin. You might remember her from the first Python episode. She’s gotten quite a bit bigger. Quarantine was really good to her. She’s very healthy and very sweet. I love her so, so much. This is my best buddy right here.
So we have been hard at work preparing Python 2.0 for you all, and we’re so excited to show you what we put together. So let’s just get started.
Why Python?
All right. So we kind of went over this in the first Python video, but just to recap.
On the first video I got to hold her in one hand. It’s a bit harder now. That’s actually why I’m wearing this. I thought I could maybe BabyBjörn you. Oh, she’s fine.
Okay, so just to recap, why Python? It’s talked about so much in the SEO community. Why is this sort of the program that most people prefer?
Simple syntax
So there’s very simple syntax. It’s sort of more common sense than other programming languages. It also uses a ton of white space. So you’re going to see tabs and sort of white spaces instead of curly brackets like some other common programming languages.
It’s concise
Did you have something to say? It’s very concise. Often there are fewer lines of code to do one thing than there might be in another language, which is very, very nice.
It’s versatile
It’s also very versatile. It works on many different platforms, and it can work in a different variety of ways. As far as procedural, you’ve probably heard of object-oriented and functional programming.
It kind of covers the gamut in that way, which is really great. You think so too? Pumpkin says she thinks so too, and it’s awesome.
Getting started
So let’s get started. So whether you’re on a Mac or a Windows, you can open up a terminal and Python should come with your Mac OS system.
There’s 2.7 kind of natively installed, and we can just use that. Or go ahead and just open a Colab notebook. So this is a Google property we’ll link to down below. You can create a new code cell. All I want you to do is simply type in print parentheses.
Sorry, what are these? Help me. Parentheses and then quotes, sorry. We’re quarantined. You know? It’s just been us two.
So print (“Hello World”) and then Shift + Enter. Congratulations, you’ve just run Python.
So we’re off to the races. You’re all Python experts basically.
Python fundamentals
Now, let’s kind of cover some of the fundamentals. These are really important especially just to be aware of as you kind of continue exploring — oh, is she on my mic, sorry — as you continue exploring Python.
Basic syntax
So we’re first just going to go over some of the basic syntax, and there’s obviously a lot more than just this, but some of the common things.
1. Variables
Variables are super, super important in Python. So this is where you just assign values to words or whatever variables you’re working with. This is sort of a silly tax price example here, where we assign a numerical value to tax and we do the same for price.
Pumpkin is showing you. She’s very excited about this example. You simply run this within Python, and you will get your price plus the tax that we have stated here. So it’s kind of a cool application just to quickly get a feel of how variables work and how when you’re dealing with numerical variables, you can do a variety of calculations.
So that’s a super powerful thing within Python and really fun to play around with.
2. Comments
Second big, big important syntax is comments. So if you have something to say, like Pumpkin here, you have to put kind of a hash and then write your comment after that.
Commonly people will use these to explain the code that’s after the comment. So you can kind of explain what you were trying to do there. It’s also very useful if you want to comment out code. So I use this all the time when I’m kind of fumbling and trying to do different things within a Colab notebook and it’s not working.
I will just comment out different things and try different ways, and oftentimes that helps me kind of find solutions quickly.
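A couple of quick examples of both uses:

```python
# This comment explains the line below: print a greeting
print("Hello World")

# print("This line is commented out while I debug")
```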
3. Data types
The next and perhaps the most powerful thing, especially if you want to start using Python for data analysis, so let’s say you want to start pulling in Google Search Console data or Google Analytics, so, so important to be aware of the different data types.
So if you’re pulling in text, like keywords from Search Console, it should be picked up in Python as string (str). Sometimes this gets screwed up when you import data. So it’s really important to have the proper data types assigned to your different types of data so that you can perform correct calculations.
So for numeric values, you have integer or just int, float, and complex. If your numbers aren’t in these data types, you won’t be able to run different calculations on them. So again, just to be aware that these exist and really the gist of it is you basically just want your data to be reflective of the proper Python data types.
So sequence is listed as those three — list, tuple, and range. Mapping is really common if you’re using dictionary type things within different programs. Then, of course, our most common, Boolean, which is true or false, is just bool.
So is Pumpkin a big, happy girl? True. She’s actually a boy. That’s a long story. But you can call her whatever. She’s having so much fun, and she’s so happy to be here.
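Here’s a quick look at those types in practice, with made-up Search-Console-style values:

```python
keyword = "whiteboard friday"   # str
clicks = 1200                   # int
ctr = 0.034                     # float
positions = [3, 5, 8]           # list
is_ranking = True               # bool

print(type(keyword), type(clicks), type(ctr), type(positions), type(is_ranking))

# Imported data often arrives as text, so convert it before doing math
impressions = int("35000")
print(round(clicks / impressions, 3))  # 0.034
```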
4. If…else
Lastly, the if…else statement. So there’s a number of different statements that you can use.
But arguably one of the more common is if else. So just a really silly example, let’s say you have Website A ranking 13 for a keyword and Website B ranking 28. You can say print (“A”) if A < B else print (“B”). So this is just a really silly, quick example to kind of show you how you can use this.
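Written out as code, that example looks like this:

```python
A = 13  # Website A's ranking position
B = 28  # Website B's ranking position

if A < B:
    print("A")
else:
    print("B")

# The same logic as a one-line conditional expression
print("A") if A < B else print("B")
```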
But once you get into that, you get into elif and loops, and it gets really, really fun and exciting.
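Written out as a full if statement, with an elif thrown in for good measure, that ranking example might look something like this:

```python
# Website A ranks 13 for a keyword, Website B ranks 28 (lower is better).
A = 13
B = 28

if A < B:
    print("A")
elif A == B:
    print("It's a tie")
else:
    print("B")
```

The one-line version above does the same thing; the block form just leaves room for more conditions.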
Conclusion
So hopefully, you start to play around with some of these and stay tuned for when we apply this to Google Search Console data. So thank you so much for kind of checking out this second round of Python basics.
My co-host is hiding behind my back right now, but she is really grateful that you all came to check out the second video of Python. So thank you, guys, so much and Pump and I will see you guys soon. Bye.
One of the biggest challenges in SEO is trying to convince your client or boss that the competition they face online may not match their legacy competitors and personal grudges. Big Earl across the street at Big Earl’s Widgets may be irritating and, sure, maybe he does have a “stupid, smug face,” but that doesn’t change the fact that WidgetShack.com is eating your lunch (and let’s not even talk about Amazon).
To make matters worse, competitive analysis is time-consuming and tedious work, even if you do have access to the data. Today, after years of rethinking how competitive analysis should work (and, honestly, re-rethinking it on many occasions), I’m proud to announce the first step in expanding Moz’s competitive analysis toolkit — True Competitor.
Before I dive into the details, let’s take it out for a spin. Just enter your domain or subdomain and your locale (the beta supports English-language markets in the United States, Great Britain, Australia, and Canada):
Then let the tool do its work. You’ll get back something like this:
True Competitor pulls ranking keywords (by highest-volume) for any domain in our Keyword Explorer database — even your competitors’ and prospects’ domains — and analyzes recent Google SERPs to find out who you’re truly competing against.
What are Overlap and Rivalry?
Hopefully, you’re already familiar with our proprietary Domain Authority (DA) metric, but Overlap and Rivalry are new to True Competitor. Overlap is simple — it’s the percentage of shared keywords where the target site and the competitor both ranked in the top 10 traditional organic results. This is essentially a Share of Voice (SoV) metric. It’s a good first stop, and you can sort by DA or Overlap for multiple views of the data — but what if the keywords you overlap on aren’t particularly relevant, or a competitor is just too far out of reach?
That’s where Rivalry comes in. Rivalry factors in the Click-Thru Rate (CTR) and volume of overlapping keywords, the target site’s ranking (keywords where the target ranks higher are more likely to be relevant), and the proximity of the two sites’ DA scores to help you sort which competitors are the most relevant and realistic.
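To make the Overlap idea concrete, here's a rough back-of-the-napkin sketch, not Moz's actual calculation, using two made-up keyword sets:

```python
# Illustrative only: not the formula True Competitor uses internally.
# Overlap here = share of the target's top-10 keywords that the competitor
# also ranks in the top 10 for.
target_top10 = {"seo tools", "keyword research", "domain authority", "serp tracker"}
competitor_top10 = {"seo tools", "keyword research", "backlink checker"}

shared = target_top10 & competitor_top10
overlap_pct = len(shared) / len(target_top10) * 100
print(f"Overlap: {overlap_pct:.0f}%")   # Overlap: 50%
```

Rivalry then weights that shared footprint by CTR, volume, the target's rankings, and DA proximity, which is harder to reduce to a one-liner.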
What can you do with this data?
Hopefully, you can use True Competitor to validate your own assumptions, challenge bad assumptions, and learn about competitors you might not have considered. That’s not all, though — select up to two competitors for in-depth information:
Just click on [ + Analyze Competitors ] and your selections will be auto-filled in our Keyword Overlap tool in Keyword Explorer. Here, you can dive deeper into your keyword overlap and find specific keywords to target with your SEO efforts:
We’re currently working on new ways to analyze this data and help you surface the most relevant keyword and content overlaps. We hope to have more to announce in Q4.
This list doesn’t match my list!
GOOD. Sorry, that’s a little flippant. Ultimately, we hope there’s something new and unexpected in this data. Otherwise, what’s the point? The goal of True Competitor is to help you see who you’re really up against in Google rankings. How you use that information is up to you.
I’d like to challenge you, dear reader, on one point. We have a bad habit of thinking of the “competition” as a single, small set of sites or companies. In the example above, I chose to explore SEMrush and Ahrefs, because they’re our most relevant product competitors. Consider if I had taken a different route:
Looking at our SEO news competitors paints a different but also very useful picture, especially for our content team and writers. We also have multiple Google subdomains showing up in our Top 25 — some Google products (like Google Search Console) are competitors, and some (like Google Analytics) are simply of interest to our readership and topics that we cover.
My challenge to you is to really think about these different spheres of competition and move beyond a singular window of what “competitor” means. You may not target all of these competitors or even care about them all on any given day, and that’s fine, but each window is an area that might uniquely inform your SEO and content strategies.
As a Subject Matter Expert at Moz, I have the privilege of working on multiple parts of our product, but this project is something I’ve been thinking about for a long time and is near and dear to me. I’d like to personally thank our Product team — Igor, Hayley, and Darian — for all of their hard work, leadership, and pushback to make this product better. Many thanks also to our App Front-end Engineering team, and a special shout-out to Maura and Grant for helping port the original prototype into an actual product.
Get started with True Competitor
True Competitor is currently available in beta for all Moz Pro customers and community accounts.
We welcome your feedback — please click on the [Make a Suggestion] button in the upper-right of the True Competitor home-page if you have any specific comments or concerns.
Did you ever turn in a school paper full of vague ramblings, hoping your teacher wouldn’t notice that you’d failed to read the assigned book?
I admit, I once helped my little sister fulfill a required word count with analogies about “waves crashing against the rocks of adversity” when she, for some reason, overlooked reading The Communist Manifesto in high school. She got an A on her paper, but that isn’t the mark I’d give Google when there isn’t enough content to legitimately fill local packs, Local Finders, and Maps.
The presence of irrelevant listings in response to important local queries:
Makes it unnecessarily difficult for searchers to find what they need
Makes it harder for relevant businesses to compete
Creates a false impression of a bountiful choice of local resources, resulting in disappointing UX
Today, we’ll look at some original data in an attempt to quantify the extent of this problem, and explore what Google and local businesses can do about it.
What’s meant by “local filler” content and why is it such a problem?
The above screenshot captures the local pack results for a very specific search for a gastroenterologist in Angels Camp, California. In its effort to show me a pack, Google has scrambled together results that are two-thirds irrelevant to the full intent of my query, since I am not looking for either an eye care center or a pediatrician. The third result is better, even though Google had to travel about 15 miles from my specified search city to get it, because Dr. Eddi is, at least, a gastroenterologist.
It’s rather frustrating to see Google allowing the one accurate specialist to be outranked by two random local medical entities, perhaps simply because they are closer to home. It obviously won’t do to have an optometrist or children’s doctor consult with me on digestive health, and unfortunately, the situation becomes even odder when we click through to the local finder:
Of the twenty results Google has pulled together to make up the first page of the local finder, only two are actually gastroenterologists, lost in the weeds of podiatrists, orthopedic surgeons, general MDs, and a few clinics with no clarity as to whether their presence in the results relates to having digestive health specialists on staff. Zero of the listed gastroenterologists are in the town I’ve specified. The relevance ratio is quite poor for the user and shapes a daunting environment for appropriate practitioners who need to be found in all this mess.
You may have read me writing before about local SEO seeking to build the online mirror of real-world communities. That’s the ideal: ensuring that towns and cities have an excellent digital reference guide to the local resources available to them. Yet when I fact-checked with the real world (calling medical practices around this particular town), I found that there actually are no gastroenterologists in Angels Camp, even though Google’s results might make it look like there must be. What I heard from locals is that you must either take a 25-minute drive to Sonora to see a GI doctor, or head west for an hour and fifteen minutes to Modesto for appropriate care.
Google has yoked itself to AI, but the present state of search leaves it up to my human intelligence to realize that the SERPs are making empty promises, and that there are, in fact, no GI docs in Angels Camp. This is what a neighbor, primary care doctor, or local business association would tell me if I was considering moving to this community and needed to be close to specialists. But Google tells me that there are more than 23 million organic choices relevant to my requirements, and scores of local business listings that so closely match my intent, they deserve pride of place in 3-packs, Finders and Maps.
The most material end result for the Google user is that they will likely experience unnecessary fatigue wasting time on the phone calling irrelevant doctors at a moment when they are in serious need of help from an appropriate professional. As a local SEO, I’m conditioned to look at local business categories and can weed out useless content almost automatically because of this, but is the average searcher noticing the truncated “eye care cent…” on the above listing? They’re almost certainly not using a Chrome extension like GMB Spy to see all the possible listing categories since Google decided to hide them years ago.
On a more philosophical note, my concern with local SERPs made up of irrelevant filler content is that they create a false picture of local bounty. As I recently mentioned to Marie Haynes:
The work of local businesses (and local SEOs!) derives its deepest meaning from providing and promoting essential local resources. Google’s inaccurate depiction of abundance could, even if in a small way, contribute to public apathy. The truth is that the US is facing a severe shortage of doctors, and anything that doesn’t reflect this reality could, potentially, undermine public action on issues like why our country, unlike the majority of nations, doesn’t make higher education free or affordable so that young people can become the medical professionals and other essential services providers we unquestionably need to be a functional society. Public well-being depends on complete accuracy in such matters.
As a local SEO, I want a truthful depiction of how well-resourced each community really is on the map, as a component of societal thought and decision-making. We’re all coping with public health and environmental emergencies now and know in our bones how vital essential local services have become.
Just how big is the problem of local filler content?
If the SERPs were more like humans, my query for “gastroenterologist Angels Camp” would return something like a featured snippet stating, “Sorry, our index indicates there are no GI Docs in Angels Camp. You’ll need to look in Sonora or Modesto for the nearest options.” It definitely wouldn’t create the present scenario of, “Bad digestive system? See an eye doctor!” that’s being implied by the current results. I wanted to learn just how big this problem has become for Google.
I looked at the local packs in 25 towns and cities across California of widely varying populations using the search phrase “gastroenterologist” and each of the localities. I noted how many of the results returned were within the city specified in my search and how many used “gastroenterologist” as their primary category. I even gave Google an advantage in this test by allowing entries that didn’t use gastroenterologist as their primary category but that did have some version of that word in their business title (making the specialty clearer to the user) to be included in Google’s wins column. Of the 150 total data points I checked, here is what I found:
42% of the content Google presented in local packs had no obvious connection to gastroenterology. It’s a shocking number, honestly. Imagine the number of wearying, irrelevant calls patients may be making seeking digestive health consultation if nearly half of the practices listed are not in this field of medicine.
A pattern I noticed in my small sample set is that larger cities had the most relevant results. Smaller towns and rural areas had much poorer relevance ratios. Meanwhile, Google is more accurate when it comes to returning results within the query’s city, as shown by these numbers:
The trouble is, what looks like more of a win for Google here doesn’t actually chalk up as a win for searchers. In my data set, where Google was accurate in showing results from my specified city, the entities were often simply not GI doctors. There were instances in which all 3 results got the city right, but zero of the results got the specialty right. In fact, in one very bizarre case, Google showed me this:
Welders aside, it’s important to remember that our initial Angels Camp example demonstrated how a searcher who encounters a pack with filler listings in it and drills down further into the Local Finder results for help may actually end up with even less relevance. Instead of two-out-of-three local pack entries being useless to them, they may end up with eighteen-out-of-twenty unhelpful listings, with relevance consigned to obscurity.
And, of course, filler listings aren’t confined to medical categories. I engaged in this little survey because I’d noticed how often, in category after category, the user experience is less-than-ideal.
What should Google do to lessen the poor UX of irrelevant listings?
Remember that we’re not talking about spam here. That’s a completely different headache in Googleland. I saw no instances of spam in my data. The welder was not trying to pass himself off as a doctor. Rather, what we have here appears to be a case of Google weighting location keywords over goods/services keywords, even when it makes no sense to do so.
Google needs to develop logic that excludes extremely irrelevant listings for specific head terms to improve UX. How might this logic work? Here are five checks Google could run, with a rough sketch of how the resulting “no” votes might combine after the fifth check.
1. Google could rely more on their own categories. Going back to our original example in which an eye care center is the #1 ranked result for “gastroenterologist angels camp”, we can use GMB Spy to check if any of the categories chosen by the business is “gastroenterologist”:
Google can, of course, see all the categories, and this lack of “gastroenterologist” among them should be a big “no” vote on showing the listing for our query.
2. Google could cross check the categories with the oft-disregarded business description:
Again, no mention of gastroenterological services there. Another “no” vote.
3. Google could run sentiment analysis on the reviews for an entity, checking to see if they contain the search phrase:
Lots of mentions of eye care here, but the body of reviews contains zero mentions of intestinal health. Another “no” vote.
4. Google could cross check the specified search phrases against all the knowledge they have from their crawls of the entity’s website:
This activity should confirm that there is no on-site reference to Dr. Haymond being anything other than an ophthalmologist. Then Google would need to make a calculation to downgrade the significance of the location (Angels Camp) based on internal logic that specifies that a user looking for a gastroenterologist in a city would prefer to see gastroenterologists a bit farther away than seeing eye doctors (or welders) nearby. So, this would be another “no” vote for inclusion as a result for our query.
5. Finally, Google could cross reference this crawl of the website against their wider crawl of the web:
This should act as a good, final confirmation that Dr. Haymond is an eye doctor rather than a gastroenterologist, even if he is in our desired city, and give us a fifth “no” vote for bringing his listing up in response to our query.
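To make that voting idea concrete, here's a toy sketch in Python. The signals, thresholds, and listing data below are entirely hypothetical, and Google's actual systems are obviously far more sophisticated, but the gist is that several independent "no" votes should outweigh a location match:

```python
# Hypothetical "no" vote tally for a query like "gastroenterologist angels camp".
def votes_against(listing, query_term):
    votes = 0
    if query_term not in listing["categories"]:
        votes += 1                                   # 1. not among the listing's categories
    if query_term not in listing["description"].lower():
        votes += 1                                   # 2. not in the business description
    if not any(query_term in review.lower() for review in listing["reviews"]):
        votes += 1                                   # 3. never mentioned in reviews
    if query_term not in listing["site_text"].lower():
        votes += 1                                   # 4. absent from the entity's own website
    if query_term not in listing["web_text"].lower():
        votes += 1                                   # 5. absent from the wider web's references
    return votes

eye_care_center = {
    "categories": {"eye care center", "optometrist"},
    "description": "Comprehensive vision care and optometry services.",
    "reviews": ["Great glasses fitting.", "Friendly, thorough eye exam."],
    "site_text": "an ophthalmology and optometry practice in angels camp",
    "web_text": "referenced around the web as an ophthalmologist",
}

if votes_against(eye_care_center, "gastroenterologist") >= 3:
    print("Exclude this listing from the query's local results")
```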
The web is vast, and so is Google’s job, but I believe the key to resolving this particular type of filler content is for Google to rely more on the knowledge they have of an entity’s vertical and less on their knowledge of its location. A diner searching for pizza may be willing to settle for tacos if there’s a Mexican restaurant a block away but no pizzerias in town, but in these YMYL categories, the same logic should not apply.
It’s not uncommon for Google to exclude local results from appearing at all when their existing logic tells them there isn’t a good answer. It’s tempting to say that solving the filler content problem depends on Google expanding the number of results for which they don’t show local listings. But, I don’t think this is a good solution, because the user then commonly sees irrelevant organic entries, instead of local ones. It seems to me that a better path is for Google to expand the radius of local SERPs for a greater number of queries so that a search like ours receives a map of the nearest gastroenterologists, with closer, superfluous businesses filtered out.
What should you do if a local business you’re promoting is getting lost amid filler listings?
SEO is going to be the short answer to this problem. It’s true that you can click the “send feedback” link at the bottom of the local finder, Google Maps, or an organic SERP, and fill out a form like this, with a screenshot:
However, my lone report of dissatisfaction with SERP quality is unlikely to get Google to change the results. Perhaps if they received multiple reports…
More practically speaking, if a business you’re promoting is getting lost amid irrelevant listings, search engine optimization will be your strongest tool for convincing Google that you are, in fact, the better answer. In our study, we realized that there are, in fact, no GI docs in Angels Camp, and that the nearest one is about fifteen miles away. If you were in charge of marketing this particular specialist, you could consider:
1. Gaining a foothold in nearby towns and cities
Recommend that the doctor develop real-world relationships with neighboring towns from which he would like to receive more clients. Perhaps, for example, he has hospital privileges, or participates in clinics or seminars in these other locales.
2. Writing about locality relationships
Publish content on the website highlighting these relationships and activities to begin associating the client’s name with a wider radius of localities.
3. Expanding the linktation radius
Seek relevant links and unstructured citations from the neighboring cities and towns, on the basis of these relationships and participation in a variety of community activities.
4. Customizing review requests based on customers’ addresses
If you know your customers well, consider wording review requests to prompt them to mention why it’s worth it to them to travel from X location for goods/services (nota bene: medical professionals, of course, need to be highly conversant with HIPAA compliance when it comes to online reputation management).
5. Filling out your listings to the max
Definitely do give Google and other local business listing platforms the maximum amount of information about the business you’re marketing (Moz Local can help!). Fill out all the fields and give functions like Google Posts, product listings, and Q&A a try.
6. Sowing your seeds beyond the walled garden
Pursue an active social media, video, industry, local news, print, radio, and television presence to the extent that your time and budget allows. Google’s walled garden, as defined by my friend, Dr. Pete, is not the only place to build your brand. And, if my other pal, Cyrus Shepard, is right, anti-trust litigation could even bring us to a day when Google’s own ramparts become less impermeable. In the meantime, work at being found beyond Google while you continue to grapple with visibility within their environment.
Study habits
It’s one thing for a student to fudge a book report, but squeaking by can become a negative lifelong habit if it isn’t caught early. I’m sure any Google staffer taking the time to actually read through the local packs in my survey would agree that they don’t rate an A+.
I’ve been in local SEO long enough to remember when Google first created their local index with filler content pulled together from other sources, without business owners having any idea they were even being represented online, and these early study habits seem to have stuck with the company when it comes to internal decision-making that ends up having huge real-world impacts. The recent title tag tweak that is generating erroneous titles for vaccine landing pages is a concerning example of this lack of foresight and meticulousness.
If I could create a syllabus for Google’s local department, it would begin with separating out categories of the greatest significance to human health and safety and putting them through a rigorous, permanent manual review process to ensure that results are as accurate as possible, and as free from spam, scams, and useless filler content as the reviewers can make them. Google has basically got all of the money and talent in the world to put towards quality, and ethics would suggest they are obliged to make the investment.
Society deserves accurate search results delivered by studious providers, and rural and urban areas are worthy of equal quality commitments and a more nuanced approach than one-size-fits-all. Too often, in Local, Google is flunking for want of respecting real-world realities. Let’s hope they start applying themselves to the fullest of their potential.
In today’s episode of Whiteboard Friday, Tom covers a more advanced SEO concept: crawl budget. Google has a finite amount of time it’s willing to spend crawling your site, so if you’re having issues with indexation, this is a topic you should care about.
Click on the whiteboard image above to open a larger version in a new tab!
Video Transcription
Happy Friday, Moz fans, and today’s topic is crawl budget. I think it’s worth saying right off the bat that this is somewhat of a more advanced topic or one that applies primarily to larger websites. I think even if that’s not you, there is still a lot you can learn from this in terms of SEO theory that comes about when you’re looking at some of the tactics you might employ or some of the diagnostics you might employ for a crawl budget.
But in Google’s own documentation they suggest that you should care about crawl budget if you have more than a million pages or more than 10,000 pages that are updated on a daily basis. I think those are obviously kind of hard or arbitrary thresholds. I would say that if you have issues with your site getting indexed and there are pages deep on your site that you want in the index but that just aren’t getting there, or if you have issues with pages not getting indexed quickly enough, then in either of those cases crawl budget is an issue that you should care about.
What is crawl budget?
So what actually is crawl budget? Crawl budget refers to the amount of time that Google is willing to spend crawling a given site. Although it seems like Google is sort of all-powerful, they have finite resources and the web is vast. So they have to prioritize somehow and allocate a certain amount of time or resource to crawl a given website.
Now they prioritize based on — or so they say they prioritize based on the popularity of sites with their users and based on the freshness of content, because Googlebot sort of has a thirst for new, never-before-seen URLs.
We’re not really going to talk in this video about how to increase your crawl budget. We’re going to focus on how to make the best use of the crawl budget you have, which is generally an easier lever to pull in any case.
Causes of crawl budget issues
So how do issues with crawl budget actually come about?
Facets
Now I think the main sort of issues on sites that can lead to crawl budget problems are firstly facets.
So you can imagine on an e-comm site we’ve got a laptops page. We might be able to filter that by size, say a 15-inch screen, or by spec, like 16 gigabytes of RAM. There might be a lot of different permutations there that could lead to a very large number of URLs when actually we’ve only got one page or one category as we think about it — the laptops page.
Those filters could then be combined in different orders, creating other URLs that show the exact same products but have to be separately crawled. The results might also be sorted differently. There might be pagination and so on and so forth. So you could have one category page generating a vast number of URLs.
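Just to put a number on how quickly this multiplies, here's a quick sketch. The parameter names and values are invented for illustration:

```python
# A handful of facets quickly multiplies into a lot of crawlable URLs.
from itertools import product

screen_sizes = ["13-inch", "14-inch", "15-inch", "17-inch"]
ram_options = ["8gb", "16gb", "32gb"]
sort_orders = ["price-asc", "price-desc", "newest"]
pages = range(1, 6)   # five pages of pagination

urls = [
    f"/laptops?screen={s}&ram={r}&sort={o}&page={p}"
    for s, r, o, p in product(screen_sizes, ram_options, sort_orders, pages)
]
print(len(urls))      # 180 URLs from a single "laptops" category
```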
Search results pages
Another thing that often comes up is search results pages from an internal site search. Especially if they’re paginated, they can generate a lot of different URLs.
Listings pages
Listings pages. If you allow users to upload their own listings or content, then that can build up to an enormous number of URLs over time. Think of a job board or something like eBay, which probably has a huge number of pages.
Fixing crawl budget issues
So what are some of the tools that you can use to address these issues and to get the most out of your crawl budget?
So as a baseline, if we think about how a normal URL behaves with Googlebot, we say, yes, it can be crawled, yes, it can be indexed, and yes, it passes PageRank. So with URLs like these, if I link to them somewhere on my site and then Google follows that link and indexes these pages, these probably still have the top nav and the site-wide navigation on them. So the link equity that’s passed through to these pages will be sort of recycled round. There will be some losses due to dilution when we’re linking through so many different pages and so many different filters. But ultimately, we are recycling this. There’s no sort of black hole loss of leaky PageRank.
Robots.txt
Now at the opposite extreme, the most extreme sort of solution to crawl budget you can employ is the robots.txt file.
So if you block a page in robots.txt, then it can’t be crawled. So great, problem solved. Well, no, because there are some compromises here. Technically, sites and pages blocked in robots.txt can be indexed. You sometimes see sites or pages showing up in the SERPs with a message along the lines of “a description for this result cannot be shown because the page is blocked in robots.txt.”
So technically, they can be indexed, but functionally they’re not going to rank for anything or at least anything effective. So yeah, well, sort of technically. They do not pass PageRank. We’re still passing PageRank through when we link into a page like this. But if it’s then blocked in robots.txt, the PageRank goes no further.
So we’ve sort of created a leak and a black hole. So this is quite a heavy-handed solution, although it is easy to implement.
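If you want to sanity-check what your robots.txt is actually blocking, Python's standard library can do it. Here's a minimal sketch, with example.com standing in for your own domain:

```python
# Check whether Googlebot is allowed to fetch a given URL under robots.txt.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")   # placeholder domain
rp.read()

url = "https://www.example.com/laptops?sort=price-asc"
print(rp.can_fetch("Googlebot", url))              # False if a disallow rule matches
```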
Link-level nofollow
Link-level nofollow, so by this I mean if we took our links on the main laptops category page, that were pointing to these facets, and we put a nofollow attribute internally on those links, that would have some advantages and disadvantages.
I think a better use case for this would actually be more in the listings case. So imagine if we run a used car website, where we have millions of individual used car listings, each a sort of product page. Now we don’t really want Google to be wasting its time on these individual listings, depending on the scale of our site perhaps.
But occasionally a celebrity might upload their car or something like that, or a very rare car might be uploaded and that will start to get media links. So we don’t want to block that page in robots.txt because that’s external links that we would be squandering in that case. So what we might do is on our internal links to that page we might internally nofollow the link. So that would mean that it can be crawled, but only if it’s found, only if Google finds it in some other way, so through an external link or something like that.
So we sort of have a halfway house here. Now technically nofollow these days is a hint. In my experience, Google will not crawl pages that are only linked to through an internal nofollow. If it finds the page in some other way, obviously it will still crawl it. But generally speaking, this can be effective as a way of restricting crawl budget or I should say more efficiently using crawl budget. The page can still be indexed.
That’s what we were trying to achieve in that example. It can still pass PageRank. That’s the other thing we were trying to achieve. Although you are still losing some PageRank through this nofollow link. It still counts as a link, and so you’re losing some PageRank that would otherwise have been piped through that link had it been a followed link.
Noindex, nofollow
Noindex and nofollow, so this is obviously a very common solution for pages like these on ecomm sites.
Now, in this case, the page can be crawled. But once Google gets to that page, it will discover it’s noindex, and it will crawl it much less over time because there is sort of less point in crawling a noindex page. So again, we have sort of a halfway house here.
Obviously, it can’t be indexed. It’s noindex. It doesn’t pass PageRank outwards. PageRank is still passed into this page, but because it’s got a nofollow in the head section, it doesn’t pass PageRank outwards. This isn’t a great solution. We’ve had to accept some compromises here to economize on crawl budget.
Noindex, follow
So a lot of people used to think, oh, well, the solution to that would be to use a noindex follow as a sort of best of both. So you put a noindex follow tag in the head section of one of these pages, and oh, yeah, everyone is a winner because we still get the same sort of crawling benefit. We’re still not indexing this sort of new duplicate page, which we don’t want to index, but the PageRank solution is fixed.
Well, a few years ago, Google came out and said, “Oh, we didn’t realize this ourselves, but actually as we crawl this page less and less over time, we will stop seeing the link and then it kind of won’t count.” So they sort of implied that this no longer worked as a way of still passing PageRank, and eventually it would come to be treated as noindex and nofollow. So again, we have a sort of slightly compromised solution there.
Canonical
Now the true best of all worlds might then be canonical. With the canonical tag, it’s still going to get crawled a bit less over time, the canonicalized version, great. It’s still not going to be indexed, the canonicalized version, great, and it still passes PageRank.
So that seems great. That seems perfect in a lot of cases. But this only works if the pages are near enough duplicates that Google is willing to consider them a duplicate and respect the canonical. If they’re not willing to consider them a duplicate, then you might have to go back to using the noindex. Or if you think actually there’s no reason for this URL to even exist, I don’t know how this wrong order combination came about, but it seems pretty pointless.
301
I’m not going to link to it anymore. But in case some people still find the URL somehow, we could use a 301. For saving crawl budget, that’s eventually going to perform pretty well, I’d say even better than canonical and noindex, because on the rare occasion Google does check the page, it doesn’t even have to look at it; it just follows the 301.
It’s going to solve our indexing issue, and it’s going to pass PageRank. But obviously, the tradeoff here is users also can’t access this URL, so we have to be okay with that.
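If you want to audit which of these signals a given URL is actually sending, a short script can pull the status code, meta robots, and canonical in one go. This is a minimal sketch that assumes the third-party requests and beautifulsoup4 packages are installed, with a placeholder URL:

```python
# Report the crawl/index signals a URL sends: status code, meta robots, canonical.
import requests
from bs4 import BeautifulSoup

def check_signals(url):
    resp = requests.get(url, allow_redirects=False, timeout=10)
    print("Status code:", resp.status_code)
    if resp.status_code in (301, 302, 307, 308):
        print("Redirects to:", resp.headers.get("Location"))
        return
    soup = BeautifulSoup(resp.text, "html.parser")
    robots_meta = soup.find("meta", attrs={"name": "robots"})
    print("Meta robots:", robots_meta.get("content") if robots_meta else "not set")
    canonical = None
    for link in soup.find_all("link"):
        if "canonical" in (link.get("rel") or []):
            canonical = link.get("href")
    print("Canonical:", canonical or "not set")

check_signals("https://www.example.com/laptops?sort=price-asc")  # placeholder URL
```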
Implementing crawl budget tactics
So sort of rounding all this up, how would we actually employ these tactics? So what are the activities that I would recommend if you want to have a crawl budget project?
One of the less intuitive ones is speed. Like I said earlier, Google is sort of allocating an amount of time or amount of resource to crawl a given site. So if your site is very fast, if you have low server response times, if you have lightweight HTML, they will simply get through more pages in the same amount of time.
So this counterintuitively is a great way to approach this. Log analysis, this is sort of more traditional. Often it’s quite unintuitive which pages on your site or which parameters are actually sapping all of your crawl budget. Log analysis on large sites often yields surprising results, so that’s something you might consider. Then actually employing some of these tools.
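Before getting to those tools, here's a rough idea of what that log analysis might look like in practice. It's a minimal sketch that counts Googlebot hits per path, assuming a standard combined access log format; the file name is a placeholder, real log formats vary, and ideally you'd verify Googlebot by reverse DNS rather than by user agent alone:

```python
# Count Googlebot requests per URL path from an access log.
from collections import Counter
from urllib.parse import urlsplit

hits = Counter()
with open("access.log") as log:                  # placeholder file name
    for line in log:
        if "Googlebot" not in line:
            continue
        parts = line.split('"')
        if len(parts) < 2:
            continue
        request = parts[1]                       # e.g. 'GET /laptops?sort=price-asc HTTP/1.1'
        pieces = request.split()
        if len(pieces) < 2:
            continue
        hits[urlsplit(pieces[1]).path] += 1

for path, count in hits.most_common(20):
    print(count, path)
```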
So redundant URLs that we don’t think users even need to look at, we can 301. Variants that users do need to look at, we could look at a canonical or a noindex tag. But we also might want to avoid linking to them in the first place so that we’re not sort of losing some degree of PageRank into those canonicalized or noindex variants through dilution or through a dead end.
Robots.txt and nofollow, as I sort of implied as I was going through it, these are tactics that you would want to use very sparingly because they do create these PageRank dead ends. Then lastly, a more recent and more interesting tip that I got a while back from an Ollie H.G. Mason blog post, which I’ll probably link to below: it turns out that if you have a sitemap on your site that you only use for fresh or recently changed URLs, then, because Googlebot has such a thirst for fresh content, like I said, they will start crawling this sitemap very often. So you can sort of use this tactic to direct crawl budget towards the new URLs, and sort of everyone wins.
Googlebot only wants to see the fresh URLs. You perhaps only want Googlebot to see the fresh URLs. So if you have a sitemap that only serves that purpose, then everyone wins, and that can be quite a nice and sort of easy tip to implement. So that’s all. I hope you found that useful. If not, feel free to let me know your tips or challenges on Twitter. I’m curious to see how other people approach this topic.