You're welcome (though I'm a bit concerned that I actually helped someone produce black-hat code). The most interesting thing about it, SEO-wise, is that if you feed it keyword-rich text, it will produce keyword-rich text (please note that I object to this form of generated content). You, and all other programmers, should also try the other challenges on Prag Dave's site, and do read up on his explanation of why you should do them. The basic reasoning is that in many other areas where skill is needed, practitioners spend at least some time exercising, and that this might be useful for programmers as well. I think everybody on this forum would benefit from reading the introduction of the page linked above. It can certainly be applied to most fields, not just programming.
Don't know where you're getting your results from, but Google reports all zeroes. Check at http://www.mcdar.net/q-check/datatool.asp
It's still under construction; we started the project a week ago. I wish it were finished, but there are many things that still need a lot of thought, and also programming. There are parts working, but the main part is still in development. The test is on a testing URL, and there is only one public page. In a week or so it will have a domain name of its own. It would be better if you read all the previous pages.
Interesting to watch this. I just fought with Google over one of my new sites that had loads of pages dropped and sat around 100,000 indexed, but today it is back up to over 800,000, so I am aiming for 1 million shortly.
It would be interesting if a number of billion-page sites sprang up. How would that affect the way the guys at Google plan to spider the web? Or do you think they would just get ignored? Interesting. I'm wondering how many pages the 'world library' they are planning would take up...
Now I'm in the hardest part of the work: deciding what the page structures will be. For those who know something about programming, I need your opinion.

Main idea: take the URL, break it apart, and calculate the page structure from it. The URL will be www.domain.com/001/001/001/001. The idea is to use the versatility of the MD5 hash function (see the last lines to learn what an MD5 hash is) to ask a question and get back a repeatable answer. Here is what I mean:

function ask_md5($what, $min, $max) {
    // Turn the first 8 hex digits of the MD5 hash into an integer...
    $question = hexdec(substr(md5($what), 0, 8));
    // ...and map it into the range [$min, $max) with the modulus operator.
    $answer = $min + ($question % ($max - $min));
    echo "If you ask \"$what\", from $min to $max, the computer says \"$answer\"\n";
    return $answer; // returned as well, so the value can be used to build the page
}

And you can ask for anything; it will always return the same number for the same question:

ask_md5("Which number of paragraphs we will have in http://{$_SERVER['HTTP_HOST']}{$_SERVER['REQUEST_URI']}", 3, 100);

Here we have the number of paragraphs depending on the URL. We can do the same at a deeper level:

$terms = array('noun','adj','adv','verb','4n-gram','5n-gram','6n-gram',
               '7n-gram','8n-gram','det','2n-det','verb2');
ask_md5("Type of word in paragraph 2, line 1, word 33 in http://{$_SERVER['HTTP_HOST']}{$_SERVER['REQUEST_URI']}", 0, 12);

We can ask as many questions as we want, and then generate the page depending on the results. I think this system is one of the easiest ways to return the same words for the same URL, but is there any better or smarter way to do it? Try the code, let's hear your opinion!
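Just to show how the questions could be chained together into a full outline, here is a rough sketch built on the function above. The paragraph and word ranges, and the idea of mapping each slot to a term type, are only assumptions for illustration; the actual word lists are not shown.

// Sketch of chaining the questions into a repeatable page outline.
// Ranges and the term-type mapping are assumptions; real word lists are not shown,
// and the echo inside ask_md5 is just debug output you would silence here.
$url = "http://{$_SERVER['HTTP_HOST']}{$_SERVER['REQUEST_URI']}";
$terms = array('noun','adj','adv','verb','4n-gram','5n-gram','6n-gram',
               '7n-gram','8n-gram','det','2n-det','verb2');

$outline = array();
$paragraphs = ask_md5("Number of paragraphs in $url", 3, 100);
for ($p = 1; $p <= $paragraphs; $p++) {
    $words = ask_md5("Number of words in paragraph $p of $url", 20, 120);
    for ($w = 1; $w <= $words; $w++) {
        // Each question is unique per (url, paragraph, word), so the answer never changes.
        $outline[$p][$w] = $terms[ask_md5("Type of word $w in paragraph $p of $url", 0, 12)];
    }
}
// $outline now says, for every slot on the page, which type of word to put there.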
Are you trying to create semantically correct text? I want to create a good auto-text generator too. Have you looked at any of the black-hat tools for ideas?
I've done a little (actually about 3 years of) homework and learned that most search engines do see all, or most, pages from a site (clue!). There's something I know that spammers wish they knew, though, and I'm not gonna tell, EVER. I want to rank high thanks to all the hard work it took me to learn how to rank high the right way. It would all be pissed away if spammers got hold of that information; they'd just take over. I'm already experiencing some of the effects of spammer attacks trying to rule my keywords. If I got you all confused, sorry. To sum it up, I'm not a spammer, though I study their techniques, so if a spammer tries taking my rank, I'll know how to go about outsmarting him/her without resorting to spam. =D
Well, it looks as if you took over some habits of the spammers. You're spamming the forum... Something else: isn't a forum about sharing ideas with one another? So get the story going on how you can rank well after studying search engine behaviour for three years.
I am curious as to how you are going to get each word. Are you going to do a DB query on a large table of n-grams/words for each word on each page? That would be a lot of DB queries.
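If it helps, one way to keep it to a single query per page might look like the sketch below. It assumes a hypothetical `words` table (integer `id` primary key, `word` column, roughly 100,000 rows), an open PDO connection in `$pdo`, and the ask_md5 function from earlier returning its value; none of that is from the actual project.

// Sketch: derive the word IDs deterministically, then fetch them all in one query.
$uri = $_SERVER['REQUEST_URI'];
$wordCount = ask_md5("Number of words on $uri", 200, 400);
$ids = array();
for ($i = 1; $i <= $wordCount; $i++) {
    $ids[] = ask_md5("ID of word $i on $uri", 1, 100000); // 100000 = assumed table size
}
$ids = array_values(array_unique($ids));
$placeholders = implode(',', array_fill(0, count($ids), '?'));
$stmt = $pdo->prepare("SELECT id, word FROM words WHERE id IN ($placeholders)");
$stmt->execute($ids);
$words = $stmt->fetchAll(PDO::FETCH_KEY_PAIR); // id => word map to build the page from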
Sorry about not updating the thread; I've been working hard for many days. In a few days we will have the test site, version 0000.0000001, but it will work. Yes, I'll post the source code here, with comments, explanations, and the idea behind each line of PHP. Maybe some of you can help improve it. I've been looking for the best way to generate the content, but the hard part is that it has to be generated from only 12 parameters (the ID of the page, in the URL /000/000/000/000), and from those parameters you should have ALL the info needed to always generate the same page, with the same links and the same paragraphs, for each URL. I think I've found the way with the function ask_md5, which gives you a numeric answer to any question, from MIN to MAX, and it's pretty fast. Has anyone tested the ask_md5 function? Did you understand it? (Not the PHP code, the essence of the function.)
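To make the 12-parameter idea more concrete, here is a minimal sketch of pulling the ID back out of the URL. The rewrite pattern and the 404 handling are just illustrative assumptions, not the real code.

// Sketch: recover the 12 digits that identify the page from a URL like /001/002/003/004.
if (!preg_match('#^/(\d{3})/(\d{3})/(\d{3})/(\d{3})$#', $_SERVER['REQUEST_URI'], $m)) {
    header('HTTP/1.0 404 Not Found'); // anything outside the pattern is not a page
    exit;
}
$pageId = "$m[1]/$m[2]/$m[3]/$m[4]"; // the four 3-digit groups = the 12 parameters
// From here on, every question is asked relative to $pageId, so the page is reproducible:
$paragraphs = ask_md5("Number of paragraphs on page $pageId", 3, 100);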
Hi folks, I've finally got enough time to finish the first release of code that works. I've uploaded the source and added mod_rewrite rules to work with styles. If you want to take a look: Here
Hasn't cnn.com already done this? lol. Hey, is there some way I can find out how many pages a site has? Some sort of tool?
There are two ways:
- A spider tool: a script that reads a page, follows all of its links, reads those pages, and follows their links in turn. If you restrict it to the same domain, you'll know how many pages the site has, but it may waste a lot of bandwidth, it will show up in the site's statistics that someone has spidered it, and in the worst case you'll get banned for massively downloading the site.
- The site: command in search engines. The only catch is that it will only show you the indexed pages of the site, which may be lower than or equal to the total number of pages.
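To make the spider option concrete, here is a rough sketch in PHP. The start URL and the page cap are placeholders; a real crawler should also respect robots.txt and resolve relative links, which this one skips.

// Rough same-domain page counter: follow absolute links only, politely, up to $limit pages.
function count_pages($start, $limit = 500) {
    $host  = parse_url($start, PHP_URL_HOST);
    $queue = array($start);
    $seen  = array($start => true);
    while ($queue && count($seen) < $limit) {
        $url  = array_shift($queue);
        $html = @file_get_contents($url);
        if ($html === false) { continue; }
        $dom = new DOMDocument();
        @$dom->loadHTML($html); // suppress warnings from messy real-world HTML
        foreach ($dom->getElementsByTagName('a') as $a) {
            $href = $a->getAttribute('href');
            // Only absolute links on the same host; relative links are left out of this sketch.
            if (parse_url($href, PHP_URL_HOST) === $host && !isset($seen[$href])) {
                $seen[$href] = true;
                $queue[]     = $href;
            }
        }
        sleep(1); // slow down so the crawl does not look like a massive download
    }
    return count($seen);
}

echo count_pages('http://www.example.com/'); // example.com is just a placeholder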