Technical SEO testing: How Googlebot handles iframes
Earlier this year at SMX Advanced, I presented results from our Peak Ace test lab. These tests shed some light on several technical implementation points and how Googlebot would deal with them.
One of my favorite tests examined Google’s indexing of iFramed URLs and their content. In my SMX Advanced presentation, I touched on various scenarios that may lead Google to index the content inside an iFrame, while “assigning” that content to its parent URL.
The parent URL can, in some instances, rank for content that only exists in the iFramed URL and not in the parent URL.
Naturally, this excited people – and all sorts of follow-up questions arose. Here are a few of them with my answers.
In the iFrame test, was the iFramed content coming from the same domain or a different one?
My example showed two URLs that live on the same domain:
domain.com/test.html would iFrame
domain.com/tobeframedA.html, so that
test.html could rank for content that only exists in
The same also works for
externaldomain.com/tobeframedB.html – which can still cause
test.html to rank for content only present in
tobeframedB.html, as well as for iFrames residing on subdomains. We tested every combination we could think of and concluded that it made no difference where the iFrame content was hosted.
If you want to prevent someone from loading (and ranking for) your content in an iFrame, it would be a good idea to look into the X-Frame-Options Header. This indicates whether a browser should be allowed to render a page in an iFrame.
If we were to use iFrames with a no-indexed content page, would the parent page still rank for the listed content with the intent to improve page speed?
As soon as the iFramed URL contains a meta robots noindex directive, the parent URL won’t be able to rank for the content from the iFramed URL.
The same is true if you iFrame a URL that would be served with an X-Robots noindex header directive or is actively blocked using robots.txt.
As far as page speed is concerned, iFrames support the
loading="lazy" attribute, which would defer offscreen iFrames from being loaded until a user scrolls near them. This is an elegant solution for speeding up loading times for URLs that depend on iFramed content.
Does Google give full value to semi-hidden content (content that typically comes after ‘Read more’)?
There doesn’t seem to be too much love for using “Read more” functionality within the ranks of Google. John Mueller went on record a couple of times here and here, questioning the use of the functionality in its entirety. Mueller added, “I don’t think you’d see a noticeable, direct change in SEO, […]”.
When we tested it, the purpose of the test was to understand what difference the technical implementation could potentially make – and if, in general, content behind a “Read more” would be indexed (if correctly set up).
The short answer: whether or not it was visible, the content would be indexed, found and returned.
However, content that was invisible during loading didn’t get highlighted in the snippet. The technical implementation didn’t make a difference (as long as the content was part of the HTML DOM at load), leaving you free to use
That said, in my opinion, it is impossible – due to various factors outside of our control – to create a test setup that (including results) could provide an accurate answer regarding the “full value” part of the question.
Did you mention that duplication in certain areas of the content can be fixed by CSS implementation since it is not indexed?
I did present some behavior that I find fairly interesting regarding CSS selectors. What technically happens is that selectors such as
::before create a pseudo element that is the first child of the selected element. In practice, this is often used to add cosmetic content to an HTML element.
This could also be useful from an SEO point of view because Googlebot seems to treat this just as it would treat Chrome on desktop/smartphone. The rendered DOM remains unchanged (which is to be expected since it’s a pseudo class). As a result, content from within said selectors won’t be indexed.
So, ultimately you could use this to prevent certain content from being indexed without keeping it from being displayed on the website. Maybe you have to display certain content that gets classified as “boilerplate” (e.g., shipping info, or legal info) or you want to create a certain content footprint. This opens up a great many possibilities to explore further.
Watch: Technical SEO testing in 2022: Separating fact from fiction
Below is the complete video of my SMX Advanced presentation.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.
New on Search Engine Land