Using Scrapebox to Find Expired Domains

Note that this is part of the expired domain series which has been compiled in order of importance here.

 Update: Although PBNs still work, they now have a history of being targeted by Google and therefore may not be the safest option. This is why we now focus on creating online businesses that are independent of SEO traffic.

This was originally part of a private course for interns, but I’ve decided to open it up to the world.  It was written by ID, one of the original interns who headed a group of V2 Interns.

 

<Disclaimer> A lot of people are having difficulty finding domains. There is a real learning curve, so please be prepared for this. As an option for those who are just starting out, I recommend you prove the model by first purchasing a few domains, or try out our latest service, RankHero.

RankHero allows you to test the viability of this method without having to invest the hundreds of hours and thousands of dollars it takes to find, build, host and maintain your own Private Content Network. </Disclaimer>

Hayden showed some excellent methods of using Scrapebox in his video.

These techniques of Hayden’s are the basis — the starting point — of using Scrapebox for finding domains.  Your task is to take Hayden’s methods and add some creative tweaks of your own.

Here are some suggestions to get you going:

* Use the Scrapebox Link Extractor to do an internal crawl then an external one.

After you’ve found some high-PR domains, do an internal link extract on them. This will find all the pages on the site that the homepage links to. Why do this? Well, if your root domain is a PR 7, the pages it links to will typically be around PR 5.

Next, use the Link Extractor again, this time finding external links. To continue the example from above, the sites to which those PR 5 pages link will typically be around PR 3.
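
If you want to see the mechanics outside Scrapebox, here is a minimal Python sketch of the same two-pass idea: pull a page’s internal links first, then treat anything pointing off-domain as an external link. The sample HTML and URLs are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkParser(HTMLParser):
    """Collect every href found in <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(base_url, html, internal=True):
    """Return absolute links from html, filtered to internal
    (same domain as base_url) or external links."""
    parser = LinkParser()
    parser.feed(html)
    base_domain = urlparse(base_url).netloc
    out = []
    for href in parser.links:
        absolute = urljoin(base_url, href)
        if (urlparse(absolute).netloc == base_domain) == internal:
            out.append(absolute)
    return out

# Pass 1: internal crawl of the homepage...
page = '<a href="/about">About</a> <a href="http://other-site.com/tools">Tools</a>'
print(extract_links("http://example.com/", page, internal=True))
# ...then pass 2: extract the external links those internal pages hold.
print(extract_links("http://example.com/", page, internal=False))
```

In practice you would feed the internal URLs from pass 1 back in and run pass 2 on each of them, which is exactly what the Link Extractor does for you in bulk.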

*  Newspapers tend to have some authority and PR.

I found a website that listed websites for the major newspapers, and then created a doc that looked like this:

site:sacbee.com

site:sfchronicle.com

site:denverpost.com

site:latimes.com

And so forth.

I load this doc into the keywords area of the Harvester, I type in a footprint that is sure to include -2012 -2011 -2010 -2009 -2008, and start my harvest.
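
Building that keyword doc by hand gets tedious; a few lines of Python can generate it, along with the year-subtraction footprint. The newspaper domains are just the examples from the list above.

```python
# Newspaper domains from the list above; extend with your own.
papers = ["sacbee.com", "sfchronicle.com", "denverpost.com", "latimes.com"]

# One "site:" line per paper, ready to paste into the Harvester keywords.
keywords = ["site:" + domain for domain in papers]

# The year-subtraction footprint for the footprint box at the top.
footprint = " ".join("-{}".format(y) for y in range(2012, 2007, -1))

print("\n".join(keywords))
print(footprint)  # -2012 -2011 -2010 -2009 -2008
```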

* Focus on your queries in the Harvester. Enter queries that will bring up high authority sites, old sites, popular sites, well-respected sites. What can you type up there to bring up sites with these characteristics?

* Understand how search engines work. Scrapebox passes your search queries on to Google, Yahoo, and Bing, so use those search engines’ operators to hone or broaden your search.

As a simple example, the operator “site:” will specify what site or sites the search engines will examine. “No Hat site:.edu,” therefore, will search for the terms “no” and “hat” on .edu sites.

This link gives a great, concise overview of Google’s search operators.

A particularly useful one here is the tilde (~), which will search synonyms for you. “~vehicle” will search airplane, car, automobile, moped, rickshaw, and so forth.

* Have a look at Scrapebox’s pre-set footprints. Obviously, these have all been used before. But how can you tweak them a bit to bring up new lists that others have not created?

Site:.edu is a great starting footprint.  You can add your search term and tweak it by specifying dates.  You’ll get nice results.

*  “Subtracting” the year with -2012 -2011 -2010, etc., is one of Hayden’s tried and true techniques that we interns have implemented with great success.

But to vary your results, try different queries.  Sometimes I will add a keyword list to the Harvester that looks like this:

intitle:2004

intitle:2005

intitle:2006

intitle:2007

And so on.

* Scrapebox will harvest only 1,000 results per query. (I think this limitation is inherited from search engines, which will themselves stop showing results after a thousand.) But by tweaking your search a bit, you can keep the results coming. So for example, search “baseball cap” and then “baseball hat” and then “baseball helmet.” Many more results.

(Of course, on this site we like to keep things a bit more hatless.)

*  Just a note that Hayden wanted me to mention:  the Alive check in Scrapebox.  This tool is meant to determine whether a live website is associated with a given domain.

I used it for a while, thinking that it would clue me in to good expired domains.  I would take all the sites the checker listed as “dead” and run them through name.com to determine their availability.

I did some testing, though, and I found that the Alive check didn’t do such a great job.  Many sites it listed as “alive” were actually available for registration.  Many “dead” sites were, you guessed it, alive.

There are various error codes you can play around with in the Alive check.  You can have it deem 404, for instance, to mean dead.  Personally, I didn’t find the right codes to help me in my domain searches.

Hayden pointed out that some people do like the Alive check’s results — and they therefore use it.
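
For what it’s worth, the Alive check reduces to mapping each HTTP response to “alive” or “dead” based on a configurable set of codes. A rough sketch of that classification (the code set below is illustrative, and as noted above the defaults were not reliable for domain hunting):

```python
# Codes treated as "dead"; 0 stands in here for no response at all.
DEAD_CODES = {0, 404, 410}

def classify(status_by_url):
    """Map each URL's HTTP status code to 'dead' or 'alive'."""
    return {url: ("dead" if code in DEAD_CODES else "alive")
            for url, code in status_by_url.items()}

results = classify({"http://a.com/": 200, "http://b.com/": 404, "http://c.com/": 0})
print(results)
```

The catch, as described above, is that a domain with no live site can still answer with a parked page (reading as “alive”), and an available domain can time out the same way a down-but-registered one does, so status codes alone don’t settle availability.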

* (This suggestion comes courtesy of fellow post-grad “Doomsday”): Expired software applications were coming up as premium XGLs on my lists, and I thought to explore the personal blogs/sites of the developers themselves who hadn’t taken down the link. The psychology behind this might be that a developer’s past projects are an important part of their CV, and leaving the link is perhaps a matter of pride.

inurl:.edu inurl:blog. -2012 -2011 -2010 -2009 -2008
computer science
code
programming
project
projects
software
source

I repeat the process changing the “blog.” in the footprint to “site”, “personal”, “user/”, “member/”, etc. Then repeat all of the above using “.gov.nz” instead of “.edu”. After removing duplicates I have over 100k URLs.
I might clean the list a bit to remove ccTLDs or spam words, but otherwise leave the URLs as is. I then Link Extract them in SB, or Xenu the lists at depth = 2. (I suggest no more than 10K URL lists at a time in Xenu). If you went straight to Xenu, clean all the 12007s down to the root domain and check their availability. I use name.com for this, as it can handle 7-8000 domains of the same TLD at a time. If you did a Link Extract in SB first, then Alive check the extracted links in SB, export the deads, clean them, and availability check on name.com.
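
The trim-to-root and cleaning steps can be sketched in a few lines of Python. The spam-word list and kept TLDs below are placeholders, not Doomsday’s actual filters:

```python
from urllib.parse import urlparse

SPAM_WORDS = {"casino", "pharma", "poker"}  # placeholder list

def clean_urls(urls, keep_tlds=(".edu", ".gov.nz", ".com", ".org", ".net")):
    """Trim URLs to their root domains, dedupe, and drop junk."""
    roots = set()
    for url in urls:
        domain = urlparse(url).netloc.lower()
        if domain.startswith("www."):
            domain = domain[4:]
        if not domain.endswith(keep_tlds):          # unwanted ccTLDs etc.
            continue
        if any(word in domain for word in SPAM_WORDS):
            continue
        roots.add(domain)
    return sorted(roots)

urls = [
    "http://www.example.com/blog/post-1",
    "http://example.com/about",                     # duplicate root
    "http://best-casino-win.com/",                  # spam word
    "http://dept.example.edu/projects/2004/",
]
print(clean_urls(urls))  # ['dept.example.edu', 'example.com']
```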

*  Use the “merge button” (a capital ‘M’ at the upper right) to maximize lists of good keywords.

For instance, I will load my “intitle:YEAR” list into the keyword box, and then I will merge it with the list of newspaper sites I mentioned above.  Then, I will add a keyword — such as “health resources” — into the footprint area at the top.  The resulting harvest will look like this:

“health resources” intitle:2004 site:sacbee.com

“health resources” intitle:2005 site:sacbee.com

“health resources” intitle:2006 site:sacbee.com

“health resources” intitle:2007 site:sacbee.com

“health resources” intitle:2004 site:sfchronicle.com

“health resources” intitle:2005 site:sfchronicle.com

“health resources” intitle:2006 site:sfchronicle.com

“health resources” intitle:2007 site:sfchronicle.com

“health resources” intitle:2004 site:denverpost.com

“health resources” intitle:2005 site:denverpost.com

“health resources” intitle:2006 site:denverpost.com

“health resources” intitle:2007 site:denverpost.com

And so forth.
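
That merged harvest is just a cross product of three lists, so you can also generate (and sanity-check) it outside Scrapebox. A sketch using the example values above:

```python
from itertools import product

footprint = '"health resources"'
years = ["intitle:{}".format(y) for y in range(2004, 2008)]
sites = ["site:" + d for d in ("sacbee.com", "sfchronicle.com", "denverpost.com")]

# Same ordering as the harvest above: all years per site, site by site.
queries = ["{} {} {}".format(footprint, year, site)
           for site, year in product(sites, years)]

for q in queries:
    print(q)
```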

* Explore Scrapebox. Check out all the add-ons. Play with all the buttons. There is a lot there. Once you’ve figured the software out, mess around with new ways to use it. Find methods that even Hayden and other experienced SEO’s aren’t using.

And that leads me to:

* Invent your own techniques. Learn the ones listed here, replicate them, and then get creative with them!

Creativity is really paramount here. You want to find domains that others haven’t — so you need to do Scrapebox searches that other people haven’t.

Happy harvesting.

  1. So as far as the scrapebox footprint “major newspapers” example. It would be formatted like so?

    site:thepaperboy.com -2010 -2011 -2010 -2009 -2008
    site:dailynews.com -2010 -2011 -2010 -2009 -2008
    site:azcentral.com/arizonarepublic -2010 -2011 -2010 -2009 -2008

    Is this correct?

  2. You have two -2010’s there. I think you meant 2012.

    Otherwise, yep, that’s correct.

    Of course, you also want a search term in there, too. Like health resources, etc.

  3. My broken link building method for finding domains

    1. Find authority sites that sites would naturally link to (.gov/.edu/.org/.us)
    2. Get their backlinks via Majestic
    3. Export csv
    4. Scrape external links from those pages using IMSimpler
    5. Use Scrapebox to trim to root (I’m building a macro to remove this step)
    6. Check for available domains using your favorite tool. I use +A Bulk Domain Checker.
    7. Get DA and PA via Excel with SEOmoz Free API (I can share how to do this if you have an API)
    8. Dump any domains with DA<20
    9. Get referring IPs and Trust Flow from Majestic
    10. Dump any obvious crap domains based upon Majestic metrics
    11. Use paid SEOmoz API to get MozTrust

    You could do steps 7, 8, 9 and 10 using Hayden's free API tool. I don't use it, because it's faster for me to run them through Excel. I know I will miss some domains based upon Majestic Trust Flow, but I can live with it because I make it up in speed. I may change this if I start to see Majestic Trust Flow as inaccurate.

    • This still works, but you really need a form of automation. Automator helps for building large lists, and even without software you can still find a lot using Automator + a VA doing the menial tasks like checking domains 2,000 at a time and using Netpeak to check PA/DA.

  4. I am new to this technique of searching for expired domains that are available for registration and have good metrics, but I really like the idea of it above any other way to generate income online. I’ll say this though: it is difficult to find these gems without tools or paid services. So I want to thank you and Hayden for providing this information, especially concerning the use of Scrapebox… Any further updates you have will be eagerly awaited and much appreciated.

  5. One problem I have with this method of finding domains is that most of the domains will have expired for some time already and therefore will not be indexed in Google and yet almost all the guides and tutorials I’ve read on expired domains have said to discount any that don’t show any results in Google.

  6. I would like to see an answer to Sean’s question…..

    One problem I have with this method of finding domains is that most of the domains will have expired for some time already and therefore will not be indexed in Google and yet almost all the guides and tutorials I’ve read on expired domains have said to discount any that don’t show any results in Google.

  7. I frequently buy domains that expired years ago and are no longer indexed in Google –

    if the domain has solid live backlinks and authority (which is why we buy them) then they usually get re-indexed very quickly – just be sure to put something on the domain. I’ve had domains get indexed before I could remove the hello world post.

    I’ve purchased domains that have taken 7+ days to get re-indexed before- on such domains I usually sprinkle some social links to the top backlinks of that domain and they get indexed super quick.

    I also have sites that rank very well that have their own small networks all built from expired domains that expired years ago and were no longer indexed in Google – (actually I’m not sure if they were indexed in Google or not because that isn’t even something I check anymore)

    If you’re discounting authority domains that expired years ago because they are not indexed, then you’re missing out.
