When putting together a link building strategy (or any other type of SEO strategy, for that matter), you might find the sheer amount of data available to us at a relatively low cost quite staggering.
However, that can create a problem in itself: selecting the best data to use, and then using it in a way that makes it useful and actionable, can be a lot more difficult.
Creating a link building strategy and getting it right is essential, as links are among the top two ranking signals for Google.
This may seem obvious, but to ensure you are building a link strategy that benefits from this major ranking signal and gets used on an ongoing basis, you need to do more than just provide metrics on your competitors. You should be providing opportunities for link building and PR that point you to the sites you need links from to get the most benefit.
The best way to do this is by performing a large-scale competitor link intersect.
Rather than just showing you how to implement a large-scale link intersect, which has been done many times before, I’m going to explain a smarter way of doing one by mimicking well-known search algorithms.
Doing your link intersect this way will always return better results. Below are the two algorithms we will be trying to replicate, along with a short description of both.
Topic-sensitive PageRank
Topic-sensitive PageRank is an evolved version of the PageRank algorithm that passes an additional signal on top of the traditional authority and trust scores. This additional signal is topical relevance.
The basis of this algorithm is that seed pages are grouped by the topic to which they belong. For example, the sports section of the BBC website would be categorised as being about sport, the politics section about politics, and so on.
All external links from those parts of the site would then pass on sport or politics-related topical PageRank to the linked site. This topical score would then be passed around the web via external links, just as the traditional PageRank algorithm does with authority.
You can read more about topic-sensitive PageRank in this paper by Taher Haveliwala. Not long after writing it he went on to become a Google software engineer. Matt Cutts also mentioned topical PageRank in this video.
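To make the idea more concrete, here is a minimal Python sketch of a topic-biased PageRank. It is only an illustration of the principle, not how Google implements it: the function name, the damping value and the tiny example.com-style link graph are all assumptions made up for this post. The only difference from ordinary PageRank is that the "random jump" always lands on a seed page for the topic, so score can only flow outwards from those seeds.

```python
def topic_sensitive_pagerank(links, seed_pages, damping=0.85, iterations=50):
    """Power iteration where the random jump lands only on topic seed pages.

    links: dict mapping a page to the list of pages it links to.
    seed_pages: pages known to belong to the topic (hypothetical input).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    seeds = [p for p in seed_pages if p in pages]
    if not seeds:
        raise ValueError("at least one seed page must appear in the graph")

    # Teleport vector: all of the jump probability sits on the seed pages.
    teleport = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    rank = dict(teleport)

    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * teleport[p] for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score evenly across its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links hands its score back to the seeds.
                for p in pages:
                    new_rank[p] += damping * rank[page] * teleport[p]
        rank = new_rank
    return rank


# Made-up graph: one sport seed, one politics section, and the sites they link to.
links = {
    "sport-news.example": ["football-blog.example", "kit-shop.example"],
    "football-blog.example": ["kit-shop.example"],
    "politics-news.example": ["policy-blog.example"],
    "policy-blog.example": [],
    "kit-shop.example": [],
}
sport_rank = topic_sensitive_pagerank(links, ["sport-news.example"])
```

In this toy graph the kit shop picks up sport-related score because the sport seed links to it (directly and via the football blog), while the policy blog gets none, as nothing in the sport neighbourhood links to it.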
Hub and Authority Pages
The idea of the internet being full of hub (or expert) and authority pages has been around for quite a while, and Google is likely to be using some form of this algorithm.
You can see this topic being written about in this paper on the Hilltop algorithm by Krishna Bharat, or in Authoritative Sources in a Hyperlinked Environment by Jon Kleinberg.
In the first paper by Krishna Bharat, an expert page is defined as a ‘page that is about a certain topic and has links to many non-affiliated pages on that topic’. A page is defined as an authority if ‘some of the best experts on the query topic point to it’.
Here is a diagram from the Kleinberg paper showing hubs and authorities, along with unrelated pages linking to a site:
We will be replicating the above diagram with backlink data later on!
From this paper we can gather that to become an authority and rank well for a particular term or topic, we should be looking for links from these expert/hub pages.
We need to do this because these sites are used to decide who is an authority and should rank well for a given term. Rather than replicating the above algorithm at the page level, we will instead do it at the domain level, simply because hub domains are more likely to produce hub pages.
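If you want to see how hub and authority scores reinforce each other, here is a small Python sketch of the iterative scoring idea from the Kleinberg paper, applied at the domain level. Treat it as a simplified illustration rather than a faithful reproduction of any search engine: the function name, the iteration count and the example domains are assumptions invented for this post.

```python
import math


def hub_and_authority_scores(links, iterations=30):
    """Iteratively score domains as hubs and authorities.

    links: dict mapping a linking domain to the list of domains it links to.
    A good hub links to good authorities; a good authority is linked to by good hubs.
    """
    domains = set(links) | {t for targets in links.values() for t in targets}
    hub = {d: 1.0 for d in domains}
    auth = {d: 1.0 for d in domains}

    for _ in range(iterations):
        # Authority score: sum of the hub scores of the domains linking in.
        auth = {d: 0.0 for d in domains}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]

        # Hub score: sum of the authority scores of the domains linked to.
        hub = {d: sum(auth[t] for t in links.get(d, [])) for d in domains}

        # Normalise both score sets so the numbers do not grow without bound.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for d in scores:
                scores[d] /= norm
    return hub, auth


# Made-up example: three linking sites and three dress shops.
links = {
    "wedding-directory.example": [
        "dress-shop-a.example", "dress-shop-b.example", "dress-shop-c.example",
    ],
    "bridal-magazine.example": ["dress-shop-a.example", "dress-shop-b.example"],
    "local-news.example": ["dress-shop-c.example"],
}
hub, auth = hub_and_authority_scores(links)
best_hub = max(hub, key=hub.get)
```

Here the wedding directory comes out as the strongest hub because it links to every shop, and the shops linked to by both of the stronger hubs score higher as authorities than the shop whose second link comes from the weaker local news site.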
Relevancy as a Link Signal
You have probably noticed that both of the above algorithms aim to do something very similar: pass authority depending on the relevance of the source page.
You will find similar aims in other link-based search algorithms, including phrase-based indexing. That algorithm is slightly beyond the scope of this blog post, but if we get links from the hub sites we should also be ticking the box to benefit from phrase-based indexing.
If anything, reading about these algorithms should encourage you to build relationships with topically relevant authority sites to improve rankings. Continue below to find out exactly how to find these sites.
1 – Picking your target topic/keywords
Before we find the hub/expert pages we are aiming to get links from, we first need to determine which pages are our authorities.
This is easy to do, as Google tells us which sites it deems an authority within a topic in its search results. We just need to scrape the search results for the top sites ranking for the related keywords that you want to improve rankings for. In this example we have chosen the following keywords:
- bridesmaid dresses
- wedding dresses
- wedding gowns
- bridal gowns
- bridal dresses
For scraping search results we use our own in-house tool, but the Simple SERP Scraper by the creators of URL Profiler will also work. I recommend scraping the top 20 results for each term.
2 – Finding your authorities
You should now have a list of URLs ranking for our target terms in Excel. Delete the other columns so that only the ranking URL remains, in column A. Now, in column B, add a header called ‘Root Domain’. In cell B2 add the following formula:
=IF(ISERROR(FIND("//www.",A2)), MID(A2,FIND(":",A2,4)+3,FIND("/",A2,9)-FIND(":",A2,4)-3), MID(A2,FIND(":",A2,4)+7,FIND("/",A2,9)-FIND(":",A2,4)-7))
Expand the results downwards so that you have the root domain for every URL. Your spreadsheet should now have the ranking URL in column A and its root domain in column B.
Next, add another heading in column C called ‘Count’, and in C2 add and drag down the following formula:
=COUNTIF(B:B,B2)
This will count how many times each domain shows up for the keywords we scraped. Next we need to copy column C and paste it as values to remove the formula. Then sort the table by column C from largest to smallest, so that the domains appearing most often are at the top.
Now we just need to remove the duplicate domains. We do this by going into the ‘Data’ tab in the ribbon at the top of Excel and selecting ‘Remove Duplicates’, then removing duplicates on column B. We can also delete column A so that we just have our root domains and the number of times each domain is found within the search results.
We now have the domains that Google believes to be an authority on the wedding dresses topic.
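If you would rather script this step than do it in Excel, the following is a rough Python equivalent, offered as a sketch and not part of the original workflow. The function names and example URLs are made up for illustration. Like the Excel formula, it simply strips the protocol and a leading "www." from the hostname, so other subdomains are kept as they are.

```python
from collections import Counter
from urllib.parse import urlparse


def host_without_www(url):
    """Return the hostname of a URL with any leading 'www.' removed."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def count_ranking_domains(urls):
    """Count how often each domain appears, most frequent first."""
    return Counter(host_without_www(u) for u in urls).most_common()


# Hypothetical scraped results for a few of the target keywords.
scraped_urls = [
    "https://www.shop-a.example/bridesmaid-dresses/",
    "http://shop-a.example/wedding-dresses",
    "https://www.shop-b.example/wedding-gowns/",
]
authority_domains = count_ranking_domains(scraped_urls)
```

The result is a list of (domain, count) pairs already de-duplicated and sorted, which corresponds to the root domain and count columns built above.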
3 – Export referring domains for authority sites
We usually use Majestic to get the referring domains for the authority sites, mainly because they have an extensive database of links. Plus, you get their metrics such as Citation Flow and Trust Flow, as well as Topical Trust Flow (more on Topical Trust Flow later).
If you want to use Ahrefs or another service, you could use a metric they provide that is similar to Trust Flow. You will, however, miss out on Topical Trust Flow. We’ll make use of this later on.
Pick the domains with the highest counts from the spreadsheet we have just created. Then enter them into Majestic and export the referring domains. Once we have exported the first domain, we need to insert an empty column as column A. Give this column a header called ‘Competitor’, then input the root domain in A2 and drag down.
Repeat this process for the next competitor, except this time copy and paste the new export (excluding the headers) into the first sheet we exported, so that we have all the backlink data in one sheet.
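The copy-and-paste routine above can also be sketched in Python. This is only an illustration under some assumptions: the exports are plain CSV text, and the column names used in the sample data (Domain, TrustFlow) are placeholders, so check them against the headers in your own export. The function name and example domains are invented for this post.

```python
import csv
import io


def combine_exports(exports):
    """Merge referring-domain exports into one list of rows.

    exports: dict mapping a competitor's root domain to the CSV text of
    its referring-domains export. Each row gains a 'Competitor' field.
    """
    combined = []
    for competitor, csv_text in exports.items():
        for row in csv.DictReader(io.StringIO(csv_text)):
            combined.append({"Competitor": competitor, **row})
    return combined


# Tiny made-up exports; real ones would be read from the downloaded files.
exports = {
    "shop-a.example": "Domain,TrustFlow\nwedding-directory.example,40\nbridal-magazine.example,35\n",
    "shop-b.example": "Domain,TrustFlow\nwedding-directory.example,40\n",
}
all_rows = combine_exports(exports)
```

With real files you would read each export's contents (for example with open(path, encoding="utf-8").read()) and pass them in keyed by competitor domain.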
4 – Tidy up the spreadsheet
Now that we have all the referring domains we need, we can clean up our spreadsheet and remove unnecessary columns.
I usually delete all columns except the competitor, domain, TrustFlow, CitationFlow, Topical Trust Flow Topic 0 and Topical Trust Flow Value 0 columns. I also rename the domain header to ‘URL’, and tidy up the Topical Trust Flow headers.
5 – Count repeat domains and mark already acquired links
Now that we have the data, we need to use the same formula as earlier to highlight the hub/expert domains that are linking to multiple topically relevant domains.
Add a header called ‘Times Linked to Competitors’ in column G and add this formula in G2:
=COUNTIF(B:B,B2)
This will now tell you how many competitors the site in column B is linking to. You will also want to mark domains that are already linking to your site, so that we are not building links on the same domain multiple times.
To do this, first add a heading in column H called ‘Already Linking to Site?’. Next, create a new sheet in your spreadsheet called ‘My Site Links’ and export all the referring domains for your own site’s domain from Majestic. Then paste the export into the newly created sheet.
Now, in cell H2 in our first sheet, add the following formula:
=IFERROR(IF(MATCH(B2,'My Site Links'!B:B,0),"Yes",),"No")
This checks whether the URL in cell B2 appears in column B of the ‘My Site Links’ sheet and returns ‘Yes’ or ‘No’ depending on the result. Now copy columns G and H and paste them as values, just to remove the formulas again.
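For anyone working outside Excel, the two formulas in this step can be approximated in Python as follows. This is a sketch, not a replacement for the spreadsheet: the function name and sample rows are made up, and it assumes each row has the ‘Competitor’ and ‘URL’ fields created in the earlier steps. One small difference is that it counts distinct competitors per linking domain, so a domain that appears twice for the same competitor is only counted once.

```python
from collections import defaultdict


def score_prospects(rows, my_linking_domains):
    """Count competitors linked to per domain and flag existing links.

    rows: dicts with 'Competitor' and 'URL' keys (the combined sheet).
    my_linking_domains: domains already linking to your own site.
    """
    competitors_by_domain = defaultdict(set)
    for row in rows:
        competitors_by_domain[row["URL"]].add(row["Competitor"])

    already_linking = set(my_linking_domains)
    prospects = [
        {
            "URL": domain,
            "Times Linked to Competitors": len(competitors),
            "Already Linking to Site?": "Yes" if domain in already_linking else "No",
        }
        for domain, competitors in competitors_by_domain.items()
    ]
    # Most widely shared hub domains first.
    prospects.sort(key=lambda p: p["Times Linked to Competitors"], reverse=True)
    return prospects


# Made-up combined sheet and list of domains already linking to our site.
rows = [
    {"Competitor": "shop-a.example", "URL": "wedding-directory.example"},
    {"Competitor": "shop-b.example", "URL": "wedding-directory.example"},
    {"Competitor": "shop-a.example", "URL": "bridal-magazine.example"},
]
prospects = score_prospects(rows, ["bridal-magazine.example"])
```

In this example the wedding directory links to two competitors and not yet to us, which makes it the kind of hub domain this process is designed to surface.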