Technical SEO Focus
This section highlights the critical areas of “technical” focus, along with solutions and resources for addressing common issues that affect websites large and small. For deeper, more comprehensive dives into technical SEO, review these resources:
Search Engine Journal: Advanced Technical SEO: A Complete Guide
Cognitive SEO: Technical SEO Checklist – The Roadmap to a Complete Technical SEO Audit
Page load speed
Why it’s important
Page load time is a significant SEO ranking factor
Speed impacts UX and can hurt everything from ecommerce abandonment rates to pageviews per session
Common challenges
Rapidly deploying code to get something “workable” can lead to unintended consequences for load time
Page load time requires discipline across both Engineering and Content teams; there is no “magic bullet”
General tactics and best practices
Shrink image file sizes; JPG is typically the best combination of image quality and file size
Minify CSS, JavaScript, and HTML, and enable compression; be careful not to eliminate important comments or formatting
Enable caching; consider using a content delivery network (CDN)
Test page load times, and test often
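A minimal spot check along these lines, assuming the Python `requests` library and a placeholder URL:

    # Minimal page-speed spot check (assumes the `requests` library is installed).
    # The URL below is a placeholder -- swap in the page you want to test.
    import requests

    def speed_check(url: str) -> None:
        resp = requests.get(url, headers={"Accept-Encoding": "gzip, deflate, br"}, timeout=30)
        print(f"Status:        {resp.status_code}")
        # Note: this measures server response time only, not full browser render time
        print(f"Elapsed:       {resp.elapsed.total_seconds():.2f}s")
        print(f"Compression:   {resp.headers.get('Content-Encoding', 'none')}")
        print(f"Cache-Control: {resp.headers.get('Cache-Control', 'none')}")
        print(f"Payload size:  {len(resp.content) / 1024:.0f} KB")

    speed_check("https://www.example.com/")

This only measures the server response and payload; browser-based testing is still the better gauge of real-world render time.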
Resources
Mobile-optimized
Why it’s important
Ever-increasing smartphone usage demands front-burner attention
Google announced mobile-first indexing for the whole web
Common challenges
Responsive websites are great for SEO purposes, but legacy websites (read: most websites) were designed for desktop first and then adapted to mobile
We want mobile-optimized; at minimum, we can settle for mobile-friendly
Webmasters sometimes weigh the need for mobile improvements based on existing traffic performance
This is often a fallacy: a poor mobile experience suppresses mobile traffic, so the evaluation reflects what is currently happening rather than industry expectations
General tactics and best practices
When and where applicable, design with a mobile-first mentality
Habitually monitor performance at the device-category level
Evaluate the device traffic mix against industry benchmarks, when available
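One way to make that comparison concrete, sketched with pandas against a hypothetical analytics export (the column names and benchmark percentages below are placeholders, not real industry figures):

    # Compare a site's device traffic mix to an industry benchmark.
    # Assumes a CSV export with "deviceCategory" and "sessions" columns;
    # the benchmark shares below are placeholders, not real figures.
    import pandas as pd

    benchmark = {"mobile": 0.60, "desktop": 0.35, "tablet": 0.05}  # hypothetical industry mix

    df = pd.read_csv("sessions_by_device.csv")   # e.g., exported from your analytics tool
    mix = df.groupby("deviceCategory")["sessions"].sum()
    mix = mix / mix.sum()                        # convert to share of total sessions

    for device, expected in benchmark.items():
        actual = mix.get(device, 0.0)
        print(f"{device:8s} actual {actual:.0%} vs benchmark {expected:.0%} (gap {actual - expected:+.0%})")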
Meta robots + robots.txt
Why it’s important
These under-the-hood directives can wreak havoc on a website if executed improperly
Google continues to evolve its algorithm to...
Minimize potential exploitation of black- and grey-hat tactics
Understand contextual signals
Common challenges
Misunderstanding of index/noindex, follow/nofollow, and crawl/blocked directives
Use, misuse, or lack of use of appropriate directives (i.e., non-deliberate misconfigurations)
Dedicated landing pages, “thank-you” pages, and staging sites becoming indexed
Important pages blocking helpful crawlers/crawl bots (including Google!)
General tactics and best practices
Definitions of major rules
Index/noindex
Index: tells search engines to index the page; default setting
Noindex: tells search engines to not index the page
Follow/nofollow
Follow: bots should crawl the link and pass equity; default setting
Nofollow: differs based on where the tag is placed
Rule on a page: don’t crawl any links on the page and don’t pass equity
Rule on a link: don’t crawl the link and don’t pass equity
NOTE: Google recently changed how it handles nofollow, treating it as a hint rather than a strict directive
Robots.txt disallow
Tells search engines not to crawl a certain page, directory, etc.
IMPORTANT: for search engines to read index/noindex and follow/nofollow directives, they must be able to crawl the page; a page blocked in robots.txt never has its meta robots tags read
Develop and maintain processes with “SEO checks” for code releases and content publication (one such check is sketched after this list)
Monitor emerging SEO trends and update processes to reflect latest guidance
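As one example of such a check, the sketch below uses Python’s standard-library robots.txt parser plus `requests` to confirm whether a URL is crawlable and what its meta robots tag says; the URLs and the naive regex parsing are illustrative only:

    # One possible pre-release "SEO check": is the page crawlable, and does its
    # meta robots tag say what we intend? URLs are placeholders.
    import re
    import requests
    from urllib.robotparser import RobotFileParser

    def robots_check(url: str, robots_url: str, user_agent: str = "Googlebot") -> None:
        parser = RobotFileParser(robots_url)
        parser.read()
        crawlable = parser.can_fetch(user_agent, url)
        print(f"Crawlable per robots.txt: {crawlable}")

        html = requests.get(url, timeout=30).text
        match = re.search(r'<meta[^>]+name=["\']robots["\'][^>]+content=["\']([^"\']+)', html, re.I)
        directives = match.group(1) if match else "index, follow (implicit default)"
        print(f"Meta robots: {directives}")

        # Key interaction noted above: a noindex tag on a page that robots.txt
        # blocks is never seen, because crawlers cannot fetch the page at all.
        if not crawlable and "noindex" in directives:
            print("WARNING: noindex will not be read while the page is disallowed in robots.txt")

    robots_check("https://www.example.com/thank-you/", "https://www.example.com/robots.txt")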
Resources
XML sitemaps
Why it’s important
In essence, an XML sitemap is a curated set of URLs you deem important for Google and other search bots to crawl for the purposes of indexing and ranking
Without an XML sitemap, Google still crawls the site but has no guidance on what or where to crawl
Google may not actually have enough "crawl budget" to reach pages we want crawled and indexed
Common challenges
Sites have no XML sitemap, leaving Google to crawl freely without guidance
Sites have outdated and/or misconfigured XML sitemaps, which causes Google to waste crawl budget
General tactics and best practices
Include vs exclude guidelines
We want to include content that is meant to be…
1) intentionally indexed to be found in organic search engines
2) reachable by users through internal links on the website
We do not want…
Content meant for specific purposes that do not meet the “want” criteria
Ex: thank-you pages, landing pages for paid media, staging sites
Content that returns status code errors
Content that is blocked by robots.txt (i.e., pages/sections with disallow directives)
Non-canonical URLs
Category, tag, archive, and pagination pages
Images, videos, etc.
Tools to fetch URLs and create sitemaps
The best source of truth is the actual content management system that hosts the content
Most CMSs should allow some type of export of all URLs within the CMS
WordPress, for example, has a few awesome plugins, like Yoast and Google XML Sitemaps, that are easy to use
Some CMSs may require some type of database export
Screaming Frog is a quintessential resource for a variety of reasons (< $200/yr)
If lacking ways to export URLs from the CMS, Screaming Frog can be used to crawl sites for all navigable links
Critical to remember that crawlers are only as effective as the internal links within a website -- if a page is not linked, the page will not be found in a crawl and will be considered “orphaned” (hence the CMS is the preferred route)
Screaming Frog can also develop XML sitemaps via the crawl and/or a manually uploaded list of URLs
Many other XML sitemap generators exist, usually at some cost (a minimal do-it-yourself generator is sketched at the end of this section)
XML sitemap specs
Limitations
Any single XML sitemap is limited to 50,000 URLs
Uncompressed file size limit of 50MB
Sitemap for multiple domains/subdomains
Google documents a couple of options for handling this
Make sure there are no conflicts between the XML sitemap and robots directives
If robots.txt tells Google not to crawl a page that is in the sitemap, we’re sending mixed messages
The same applies to page-level robots directives, i.e., a page in the XML sitemap should not be set to noindex
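For sites without a suitable plugin or export, a minimal do-it-yourself generator is sketched below using only Python’s standard library; it assumes a flat list of canonical, indexable URLs and splits output files at the 50,000-URL limit (file names and example URLs are illustrative):

    # Generate XML sitemap files from a flat list of canonical URLs,
    # splitting every 50,000 URLs per the sitemap spec. Paths are illustrative.
    import xml.etree.ElementTree as ET

    NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
    LIMIT = 50_000

    def write_sitemaps(urls: list[str], prefix: str = "sitemap") -> None:
        for i in range(0, len(urls), LIMIT):
            urlset = ET.Element("urlset", xmlns=NS)
            for url in urls[i:i + LIMIT]:
                loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
                loc.text = url
            ET.ElementTree(urlset).write(f"{prefix}-{i // LIMIT + 1}.xml",
                                         encoding="utf-8", xml_declaration=True)

    # Only canonical, indexable, 200-status URLs belong in the input list.
    write_sitemaps(["https://www.example.com/", "https://www.example.com/services/"])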
Resources
Other important items
HTTPS-secured sites with proper redirect logic
Google favors HTTPS-secured sites in its rankings
Ensure that sites use a valid security certificate
All HTTP URLs should redirect to their HTTPS equivalents
If a large number of internal URLs were created as absolute http:// links, they will need to be cleaned up to prevent unnecessary redirects (or worse, a mess of redirect chains)
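A quick way to spot-check redirect behavior, assuming the Python `requests` library and a placeholder URL:

    # Spot-check that an http:// URL redirects to https:// in a single hop.
    import requests

    def redirect_check(url: str) -> None:
        resp = requests.get(url, allow_redirects=True, timeout=30)
        hops = [r.url for r in resp.history] + [resp.url]
        print(" -> ".join(hops))
        if len(resp.history) > 1:
            print("WARNING: redirect chain detected (more than one hop)")
        if not resp.url.startswith("https://"):
            print("WARNING: final URL is not HTTPS")

    redirect_check("http://www.example.com/services")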
Canonical tags
Canonical tags designate the “original” version of any given page
Essentially, you’re telling Google the true/desired version of the page that should be indexed
Many different applications and principles for using canonicals
Ahrefs: Canonical Tags: A Simple Guide for Beginners is an incredibly useful read-through
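A simple spot check for canonicals, again assuming `requests` and a placeholder URL (the regex parsing is naive but adequate for a one-off audit):

    # Check whether a page's canonical tag points to the URL we expect.
    import re
    import requests

    def canonical_check(url: str) -> None:
        html = requests.get(url, timeout=30).text
        match = re.search(r'<link[^>]+rel=["\']canonical["\'][^>]+href=["\']([^"\']+)', html, re.I)
        canonical = match.group(1) if match else None
        print(f"Requested: {url}")
        print(f"Canonical: {canonical}")
        if canonical and canonical != url:
            print("NOTE: this page canonicalizes to a different URL")

    canonical_check("https://www.example.com/abc/")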
Resources
Standardized URLs: (no) trailing slash
URLs/pages may render with or without the trailing slash
https://www.hello.com/abc vs https://www.hello.com/abc/ → can be seen as two different pages
Pick one variation and create a redirect rule that always sends users (and bots) to that version (a normalization sketch covering trailing slashes and letter case appears at the end of this section)
Make sure canonical tags reflect the desired version
Any new pages published on the site should adhere to the standard variation
Standardized URLs: type case
The same page may resolve at URLs with varying letter case
https://www.hello.com/abc vs https://www.hello.com/Abc vs https://www.hello.com/ABC → each page can be seen as a unique page
Pick one variation and create a redirect rule that always sends users (and bots) to that version
HUGE PREFERENCE for all lower-case letters
Make sure canonical tags reflect the desired version
Any new pages published on the site should adhere to the standard variation
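The sketch below shows one possible normalization rule covering both standardization items, using only Python’s standard library; lowercase-plus-trailing-slash is a choice, and the important part is applying whichever convention you pick consistently in redirects and canonicals:

    # One possible normalization rule for redirects: lowercase the host and path,
    # and enforce a single trailing slash. Real rules may exempt file URLs
    # (e.g., .pdf, .jpg); this sketch applies the convention blindly.
    from urllib.parse import urlsplit, urlunsplit

    def normalize(url: str) -> str:
        parts = urlsplit(url)
        path = parts.path.lower()
        if not path.endswith("/"):
            path += "/"
        return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, parts.fragment))

    for raw in ["https://www.hello.com/abc", "https://www.hello.com/Abc/", "https://www.hello.com/ABC"]:
        print(f"{raw} -> {normalize(raw)}")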