I am developing a new bot that does article directory submission across a variety of article directory platforms, and I keep running into a common issue: when an article gets approved, it sometimes goes unnoticed by the spiders and never shows up cached. Keep in mind that all articles are synonymized and are never 100% alike, so duplicate content shouldn't be an issue. I have thought of a few ways to solve this, but each has its drawbacks.

The first possible solution would be to linkline (or possibly linkwheel) the articles and bookmark the beginning of the line. The problem is that I don't know the URL of each article until it is approved by the moderators (not always the case; sometimes I can guess the future URL), and theoretically, if I guess a bad URL, I have to edit the article and correct it. With 1,000 article directories, that's too painful.

The second possible solution would be to bookmark (Digg, Jumptags, etc.) each approved article. This may work, but it would require several accounts and several email addresses, and that is not the lazy man's approach.

The third possible solution would be to have the bot continuously check for non-cached versions of approved articles, then create an HTML file that I upload to an already indexed website. Eventually, the spiders will see the HTML file and follow through to the various uncached articles. I like the third solution best because it is the easiest from a programming standpoint.

If all links to each article come from one source (the HTML file), could that hurt my SEO? The basic linking pattern would be: one HTML file pointing to many articles, which all point to one website, with the HTML file residing on a different site than the one I am trying to SEO. Anyway, I hope that wasn't too confusing; I'm curious what you guys think.
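For what it's worth, the HTML-file half of the third solution is trivial to script. A minimal sketch, assuming the bot has already produced a list of URLs it believes are uncached (how you detect cache status is left out here; scraping Google for cache/index results is unreliable and against its terms of service, so the list is just an input):

```python
# Build a plain HTML page with one followed link per approved-but-uncached
# article URL. Upload the resulting file to an already indexed site so
# spiders can follow through to the articles.
from html import escape


def build_link_page(urls, title="Recent articles"):
    """Return a minimal HTML document containing one <li><a> per URL."""
    items = "\n".join(
        '  <li><a href="{0}">{1}</a></li>'.format(escape(u), escape(u))
        for u in urls
    )
    return (
        "<!DOCTYPE html>\n"
        "<html><head><title>{0}</title></head><body>\n"
        "<ul>\n{1}\n</ul>\n"
        "</body></html>\n".format(escape(title), items)
    )


if __name__ == "__main__":
    # Hypothetical URLs; in practice the bot would supply this list.
    uncached = [
        "http://exampledirectory.com/articles/widget-tips.html",
        "http://anotherdirectory.com/view/12345",
    ]
    with open("links.html", "w") as f:
        f.write(build_link_page(uncached))
```

The bot would regenerate and re-upload the file whenever its uncached list changes.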
Well, I think the same, and I have also tried generating some backlinks for that specific page, but it didn't help me much. I'll give the third solution mentioned in the post a try.
I wouldn't be so sure... Google has a patent on near-duplicate detection... It doesn't have to be exact to be considered a duplicate. They can detect that page X is 39.622% duplicate of page Y. Building followed backlinks to your "spun" article from relevant web pages that are already indexed is ALWAYS a good idea... for ALL articles, not just spammy "spun" articles. This will help keep your articles indexed AFTER they move off of the recent articles page at the submission site and become accessible only via links deep in the archives of the submission sites.
True enough about the duplication. It really depends on how much effort you put into spinning; 6+ synonyms for almost every word in a 500-word article should be fairly sufficient. If I have 200 uncached articles, how would I go about getting them indexed?
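The kind of spinning being described is usually written in "spintax", where each {option1|option2|...} group is replaced by one randomly chosen option so every expansion yields a different variant. A minimal sketch of an expander (an illustration of the general technique, not anyone's actual bot):

```python
# Minimal spintax expander: repeatedly replaces innermost {a|b|c} groups
# with a random choice until no groups remain, which also handles
# nested groups like {a|{b|c}}.
import random
import re

# Matches an innermost group: braces with no other braces inside.
SPIN_GROUP = re.compile(r"\{([^{}]*)\}")


def spin(text, rng=random):
    """Expand all spintax groups in `text` using `rng.choice`."""
    while True:
        text, n = SPIN_GROUP.subn(
            lambda m: rng.choice(m.group(1).split("|")), text
        )
        if n == 0:
            return text
```

Passing a seeded `random.Random` instance as `rng` makes the output reproducible, which is handy for testing how distinct the variants actually are.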