Instructions on how to get billions of pages indexed in major search engines, including Google.
Before I go into the method, I should say that I have personal experience with trying to do just that: getting millions of pages of a client’s website indexed by Google and then Yahoo/MSN. We had limited success with it, even when we used Google Sitemaps. We’d get up to 2 million of the client’s 11 million pages indexed, then the count would drop back to 167 pages, with no explanation of why.
After a year of pulling out our hair trying to figure out what Google wants (including speaking personally to the head of the Google Sitemaps program at WebmasterWorld PubCon Boston in April 06), we’re still down to a fraction of the pages this client has and wants indexed. In the last 16 months we even used three types of sitemaps (a plan A, a plan B and Google Sitemaps), but to no avail. We used different titles, keywords and descriptions, and yet got the impression that Google’s crawler still thinks the pages are too much alike. No actual diagnostics exist to tell us whether that’s true, and Google will neither confirm nor deny it.
Now word comes of sites that are getting BILLIONS of pages indexed and using those pages for Google AdSense! And getting the pages indexed in a couple of weeks! And the pages are also working for MSN and Yahoo. You have to wonder; I certainly do.
"Check out this site: search of eiqz2q.org — depending which datacentre you hit, you will see between 3.8 and 5.5 BILLION RESULTS. Even worse… the domain is EIGHTEEN DAYS OLD. That’s right, in under 3 weeks, one person has managed to get one domain 5 billion pages indexed in Google. And they are ranking, too. That particular domain has an Alexa ranking of under 7,000. Another domain owned by the same person, t1ps2see.com, has between 1.7 and 2.4 billion indexed pages and an Alexa ranking of under 2,000… after 4 weeks. Coincidentally, the sites also have 3 blocks of Adsense ads on each page. I wonder how much that one person is earning per day with billions and billions of pages indexed and ranking?"
I don’t know if the AdSense ads have anything to do with why both sites got indexed so quickly and so well; it would be good to keep that question separate from the method itself, which is described here (thanks to this poster):
- Register a meaningless domain consisting of numbers, letters, and secret symbols.
- Set up a server to manage all of your domains and subdomains. It will need to be beefy, as you will be serving a lot of traffic in a few days.
- Buy as many article databases as you can. Topic doesn’t matter. You might want to search and replace some i’s and 0’s for corresponding ASCII codes to help you avoid duplicate content.
- Create or buy a common scraper script. You’ll need it to respond with different articles based on what keyword is hit, effectively serving up new content for each subdomain. It should respond to any subdomain query. Your server should be set up to allow all subdomains to be redirected to your main page; there your script sorts out what content to serve, effectively allowing you to create an infinite number of subdomains with unique content. Now, the trick this guy is using involves subdomains of subdomains. You create a “topical” subdomain such as music.3hid9gw.org, and then on that subdomain you create your actual pages as additional sub-subdomains, like 2152.music.3hid9gw.org. Because each subdomain and each sub-subdomain is considered a new site by Google, you can get past the “1 page indexed per site” delay for new domains. If you don’t get this part, hire someone… according to the Alexa traceback, it looks like Argentina has the right people for the job.
- Launch your blog comment spam attack. Link to some of your subdomains, which are also interlinked.
- Wait a few weeks… then sit back and enjoy your billions of indexed pages. Be sure to put 3 AdSense or YPN blocks on each page.
The key point is that this method provides unique content for each page, something that all my work with our client did not do. Even though we thought we were following Google’s instructions, they have other biases that are not so easy to understand, such as a threshold of duplicate content beyond which they either don’t index pages or drop them after indexing, which is probably what is happening to my client.
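Since Google offers no diagnostics for this, one way to get a rough feel for how alike two of your own pages are is to compare their overlapping word sequences ("shingles"). The sketch below is only an illustration: nobody outside Google knows what measure or threshold they actually use, and the function names, the 3-word shingle size and the sample pages are all my own assumptions.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text.

    k=3 is an arbitrary choice for illustration, not a known Google setting.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Too short for a full window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(page_a, page_b, k=3):
    """Score two page texts from 0.0 (nothing shared) to 1.0 (identical)."""
    return jaccard(shingles(page_a, k), shingles(page_b, k))


if __name__ == "__main__":
    # Two hypothetical templated pages that differ by a single word.
    page_a = "Buy cheap widgets in Boston today from our trusted local widget store"
    page_b = "Buy cheap widgets in Denver today from our trusted local widget store"
    print(round(similarity(page_a, page_b), 3))
```

Run over a sample of pages generated from the same template, a check like this at least shows whether changing titles, keywords and descriptions moves the pages meaningfully apart, or whether the bulk of each page is still the same text.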
No doubt, this technique will be filtered out of the index eventually as Google figures out what happened.