Here is where I need the community's support; your input could help me make sure the best possible solution gets developed.

1) The tool will be server based, not a PC download.
2) The tool will check the PR of the domains, and of the URLs on those domains, that the admin (me) uploads, filtering out the pages with no PR.
3) I will only be manually checking, and uploading to the database, confirmed sites that are dofollow with the option of commenting.
4) Basic features will be keyword search in content (default) and in the title (optional).
5) Consequently, on completing a search you will not be presented with a mixed nofollow/dofollow list, just a dofollow list that has been manually checked and approved.

I have seen these solutions built using Google custom search, but I am hoping to start with 10,000 pages just to get the service going and then quickly build it up to 1,000,000 pages. I have already started sorting the list into categories and believe these are sufficient for theme-based dofollow backlinks. I guess what I am attempting to do is build a solution that combines a few things I already do and automates what it can; some things have to be done manually to achieve maximum results. I would like your thoughts and insights on this, please, so I can develop the best possible application.
Regarding point 5, on manually checked and approved: can you automate the process of checking the above-mentioned requirements? I'm really interested in this program, though.
Maybe also separate out what type of website or backlink you can get: general website, blog, forum, etc. Some people would be more interested in blogs, where it is easier to post comments, while others will take the time to get links on general sites, which are more authoritative.
Is this a Firefox extension? I can achieve most of those outcomes already with separate tools, but thanks for the link; it may simplify things for me. I am not sure of the similarities. This tool intends to dig deep into any dofollow domain and pick out the PR of each and every page; I will then have a team manually check through those links and add them to the database. We will also comment on the sites to see the results. The problem with other dofollow tools is that their verdict is only an estimate, since they rely on a page already having at least one comment in order to assess whether it is nofollow or dofollow. This solution will only list pages with dofollow commenting. Since the whole process is manual, I think we can separate the pages into blogs, forums, directories, etc., and we should also be able to state whether comments are instantly approved or moderated.
I think once you assess that a page URL is www.mysite.com/blog/content.html, you can estimate that all pages under www.mysite.com/blog/xxxxxx will be pages you can comment on. I could have a solution that lets the user flag up any URL to be checked if it turns out not to be dofollow, or if you cannot comment on it. Otherwise we will get all the URLs checked manually.
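That prefix idea can be sketched in a few lines. This is just an illustration, assuming one URL is already known to accept comments and comparing path prefixes; the function name and the example URLs are made up:

```python
from urllib.parse import urlparse

def same_comment_section(known_url: str, candidate_url: str) -> bool:
    """Heuristic: if /blog/content.html is known to accept comments,
    assume other pages under the same /blog/ prefix on the same host
    do too. (Assumption for illustration; real sites can differ per page.)"""
    known = urlparse(known_url)
    candidate = urlparse(candidate_url)
    if known.netloc != candidate.netloc:
        return False
    # Keep everything up to the last path segment, e.g. "/blog/"
    prefix = known.path.rsplit("/", 1)[0] + "/"
    return candidate.path.startswith(prefix)

print(same_comment_section("http://www.mysite.com/blog/content.html",
                           "http://www.mysite.com/blog/another-post.html"))  # True
```

URLs that fail this check would then be queued for the manual review described above.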
Looking forward to seeing this come to fruition; nice thread and one that I have already bookmarked. I'm sick of a lot of the tools out there today. It would be nice to see a good one come out of here.
Thanks, hopefully it will fulfil some expectations. The coder is awesome, and I hope to be testing at the end of the work.
Usually, the way I check whether a site is nofollow or dofollow is to pull up the page source, press Ctrl+F, and type in "nof". If the word "nofollow" pops up, then you know.
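That manual Ctrl+F check is easy to automate. Here's a minimal Python sketch that scans fetched HTML for anchors whose rel attribute contains "nofollow" (regex-based for brevity; a real tool should use a proper HTML parser and handle fetching/errors):

```python
import re

def find_nofollow_links(html: str) -> list:
    """Return the href of every <a> tag whose rel attribute contains
    'nofollow' -- the automated version of view-source plus Ctrl+F."""
    nofollow = []
    for tag in re.findall(r"<a\s[^>]*>", html, re.IGNORECASE):
        if re.search(r'rel=["\'][^"\']*nofollow', tag, re.IGNORECASE):
            m = re.search(r'href=["\']([^"\']+)', tag, re.IGNORECASE)
            if m:
                nofollow.append(m.group(1))
    return nofollow

sample = ('<a href="http://a.com/" rel="nofollow">comment link</a>'
          '<a href="http://b.com/">followed link</a>')
print(find_nofollow_links(sample))  # ['http://a.com/']
```

Any anchor not returned by this function is a candidate dofollow link worth a closer look.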
Good luck with this... my partner has developed a bot that crawls the web in search of niche-related sites that allow dofollow commenting. It's not such a hard task: if you use the appropriate Google search strings, you can turn up hundreds of blogs and forums that let you leave your comments behind. I think it's a little overkill to manually check each page; that would take a lot of manpower, and it makes little sense when you can have a computer check it for you. Either way, just throwing my 2 cents into the picture... good luck!
I made a similar script in PHP, hosted on my Apache server, after modifying the execution time limit for PHP scripts. The program crawls Google with a search term typed by the user. Once it has gathered the search results, the script visits each result, searches for posts, and then analyzes the outbound links. I am planning to add the PR of each page, but the main issue here is how many nofollow links I should discard in order to categorize a blog as dofollow: I ran many tests, and some blogs have a lot of nofollow links but dofollow comments, while others had few nofollow links and nofollow comments. Also, identifying which links come from comments is difficult because each blog is different. How is your project doing?
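For the "how many nofollow links before the blog counts as nofollow" problem, one option is to skip page-wide counts entirely and classify only the anchors found inside the comment block. A hedged Python sketch (the function and the `(href, rel)` tuple format are my own invention; isolating the comment block per blog theme is still the hard part this post describes):

```python
def classify_comment_section(comment_anchors):
    """Decide dofollow vs nofollow from anchors scraped out of the comment
    area only, ignoring the rest of the page's links.
    comment_anchors: list of (href, rel) tuples from the comment block."""
    if not comment_anchors:
        return "unknown"  # no comments yet, so nothing to judge from
    nofollow = sum(1 for _, rel in comment_anchors
                   if "nofollow" in (rel or "").lower())
    # If even one comment link is followed, the blog passes link juice.
    return "nofollow" if nofollow == len(comment_anchors) else "dofollow"

print(classify_comment_section([("http://a.com/", "nofollow"),
                                ("http://b.com/", None)]))  # dofollow
```

This sidesteps the ratio question: sidebar and theme links never enter the count.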
cURL fetch apps (PHP): you can Google around for them, and there are thousands of scripts and code samples to play with for this app. I've coded one as a Google fetch exploit running outside their API as a web scraper/data miner, though only used for ethical reasons. Here's a great tutorial to get you started: digeratimarketing.co.uk/curl-page-scraping-script ROOFIS
As nikomaster is saying, and as in my earlier post, you can scrape Google results for blog comment pages and, using cURL and some filter code, isolate the comment table in the page source. Your bot can then analyze the outbound links for 'follow'/'nofollow' and export the results. Here's a scraper I started out with while learning: learning-computer-programming.blogspot.com/2008/06/basic-web-scraping-google Arvind also includes a virtual browser bot script (in his comments section), which prevents Google from thinking it's a bot query. Hope this helps, ROOFIS
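The "virtual browser" trick mentioned above mostly boils down to sending a browser-style User-Agent header with the request. A small Python equivalent of that cURL approach (a sketch only, with a made-up example URL; note that automated scraping of Google's results pages is against their terms of service, so their official API is the supported route):

```python
import urllib.request

def browser_request(url: str) -> urllib.request.Request:
    """Build a request that identifies itself as an ordinary browser,
    so the query is less likely to be rejected as an obvious bot."""
    return urllib.request.Request(url, headers={
        "User-Agent": ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                       "Gecko/20100101 Firefox/115.0"),
    })

req = browser_request("http://www.example.com/")
# html = urllib.request.urlopen(req).read()  # fetch, then filter the comment table
```

From there, the filter code isolates the comment block and checks each outbound link's rel attribute, as the posts above describe.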