Been doing a bit of searching and testing and found:

- A list of bots to spider your site effectively.
- A way of stopping session IDs.
- Creating pages which can be easily spidered.

OK, let's do this step by step (I recommend backing up all files first):

CHECK IF YOUR SITE HAS SESSIONS

Go to http://www.tools.summitmedia.co.uk/spider/ to check how your site is spidered. You should see the session IDs there, and perhaps unlinkable pages?

STOPPING SIDS

(Guests won't be able to post unless registered, but I can't find a better way of stopping SIDs.)

#
#-----[ OPEN ]------------------------------------------
#
includes/sessions.php

#
#-----[ FIND ]------------------------------------------
#
$SID = 'sid=' . $session_id;

#
#-----[ REPLACE WITH ]------------------------------------------
#
if ( $userdata['session_user_id'] != ANONYMOUS ){
    $SID = 'sid=' . $session_id;
} else {
    $SID = '';
}

#---[EOM]----

GETTING THE SITE SPIDERED

See the post above and add these spiders to the list in sessions.php:

//
// robots array all in lower case (feel free to add more robots)
//
$seRobots = array(
'almaden.ibm.com', 'appie 1.1', 'architext', 'ask jeeves', 'asterias2.0',
'augurfind', 'baiduspider', 'bannana_bot', 'bdcindexer', 'crawler',
'crawler@fast', 'docomo', 'fast-webcrawler', 'fluffy the spider', 'frooglebot',
'geobot', 'googlebot', 'gulliver', 'henrythemiragorobot', 'ia_archiver',
'infoseek', 'kit_fireball', 'lachesis', 'lycos_spider', 'mantraagent',
'mercator', 'moget/1.0', 'muscatferret', 'nationaldirectory-webspider', 'naverrobot',
'ncsa beta', 'netresearchserver', 'ng/1.0', 'osis-project', 'polybot',
'pompos', 'scooter', 'seventwentyfour', 'sidewinder', 'sleek spider',
'slurp/si', 'slurp@inktomi.com', 'steeler/1.3', 'szukacz', 't-h-u-n-d-e-r-s-t-o-n-e',
'teoma', 'turnitinbot', 'ultraseek', 'vagabondo', 'voilabot',
'w3c_validator', 'zao/0', 'zyborg/1.0',
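For anyone trying to picture how the two pieces in this post fit together, here is a rough sketch of checking a visitor's user agent against a lower-case robots list and combining that with the guest check. This is not the actual phpBB code: the function names is_search_robot() and build_sid() are made up for illustration, the list is shortened, and -1 is only assumed to stand in for phpBB's ANONYMOUS constant.

```php
<?php
// Hypothetical helper (not part of phpBB): returns true when the
// user-agent string contains any entry from the lower-case robots list.
function is_search_robot($userAgent, array $robots)
{
    $agent = strtolower($userAgent);
    foreach ($robots as $robot) {
        if ($robot !== '' && strpos($agent, $robot) !== false) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: only logged-in, non-robot visitors get a sid.
// $anonymousId stands in for phpBB's ANONYMOUS constant (assumed -1 here).
function build_sid($sessionId, $userId, $userAgent, array $robots, $anonymousId = -1)
{
    if ($userId == $anonymousId || is_search_robot($userAgent, $robots)) {
        return '';
    }
    return 'sid=' . $sessionId;
}

// A shortened list for the example; the full list is in the post above.
$seRobots = array('googlebot', 'ia_archiver', 'teoma');

// A logged-in browser user keeps the sid; a guest spider gets none.
echo build_sid('abc123', 2, 'Mozilla/5.0 (Windows NT 10.0)', $seRobots), "\n";
echo build_sid('abc123', -1, 'Googlebot/2.1 (+http://www.google.com/bot.html)', $seRobots), "\n";
```

Matching on a lower-cased substring is why the list entries all have to be lower case.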
This is Part 2:

CREATE A ROBOTS.TXT FILE

This file stops spiders from accessing certain areas of your forum. Create a simple robots.txt file and save it in the root of your site. My forums are held in "forums/"; change this to your own directory name. Note that each Disallow path must start with a "/". The robots.txt file should contain:

User-agent: *
Disallow: /forums/admin/
Disallow: /forums/attach_mod/
Disallow: /forums/db/
Disallow: /forums/files/
Disallow: /forums/images/
Disallow: /forums/includes/
Disallow: /forums/language/
Disallow: /forums/templates/
Disallow: /forums/common.php
Disallow: /forums/config.php
Disallow: /forums/glance_config.php
Disallow: /forums/groupcp.php
Disallow: /forums/memberlist.php
Disallow: /forums/modcp.php
Disallow: /forums/posting.php
Disallow: /forums/printview.php
Disallow: /forums/privmsg.php
Disallow: /forums/profile.php
Disallow: /forums/ranks.php
Disallow: /forums/search.php
Disallow: /forums/statistics.php
Disallow: /forums/tellafriend.php
Disallow: /forums/viewonline.php
Disallow: /your-forum-folder/sutra*.html$
Disallow: /your-forum-folder/ptopic*.html$
Disallow: /your-forum-folder/ntopic*.html$
Disallow: /your-forum-folder/ftopic*asc*.html$
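If your forum lives in a different folder, retyping every line is error-prone. As a minimal sketch (the helper name build_robots_txt() and the shortened path list are my own, purely for illustration), the rules could be generated from the folder name like this, with each path starting with a "/" as robots.txt expects:

```php
<?php
// Hypothetical helper: builds robots.txt rules for a forum directory.
// $forumDir is the folder name relative to the site root, e.g. 'forums';
// pass '' if the forum sits directly in the root.
function build_robots_txt($forumDir, array $paths)
{
    $dir = trim($forumDir, '/');
    $prefix = ($dir === '') ? '/' : '/' . $dir . '/';

    $lines = array('User-agent: *');
    foreach ($paths as $path) {
        $lines[] = 'Disallow: ' . $prefix . ltrim($path, '/');
    }
    return implode("\n", $lines) . "\n";
}

// A few entries from the list above; extend with the rest as needed.
$paths = array('admin/', 'includes/', 'posting.php', 'search.php');

echo build_robots_txt('forums', $paths);
```

Save the output as robots.txt in the site root, not inside the forum folder.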
I can't insert the next lines of code since this forum limits me (you don't get this in phpBB, haha). If you want the rest of the code, email me.
Hmm, I don't understand your question. My instructions tell you what to do with it! You need to open the relevant phpBB files and edit them.
Here are some more links on SEO'ing phpBB:

http://www.computerbb.org/about580.html
http://www.able2know.com/forums/about15132.html

There are lots of different levels you can take it to. I think on my boards at seopark ( http://www.seopark.com/forums ) I have taken it as far as it can go:

1) Limit the front end to only the links and content that we want the spiders to follow when the user is not logged in.
2) Modify links to be dates on the main forum page if the user is not logged in.
3) Make all links that are spiderable appear as static, using isapirewrite.

I played with just about all the mods listed above and then customized based on my own requirements. If you get stuck or need help, you can PM me.
You know that it is hard for a person at any level of expertise to imagine that his directions are anything but totally clear. However, we do know better, don't we? Attempting to go step by step, I went to summitmedia and entered my URL. Got a few pointers of value, and fixed them. Then I came back for the next step, and have no idea what to do with this! You see, to start with one has to understand the terms (stopping SIDs? What are SIDs? Do I have those, do I need to stop them, and where are they?). At that point, as so often, the instruction session is over, because you speak a foreign language. Sorry, but this is no criticism. I know exactly how it comes about, because I do it to people now and then. You know exactly when you have lost them, by the glazed look in their eyes. Like generation gaps, there are IT gaps, or whatever we want to call this.
It's step by step. If your host doesn't have a way for you to edit files directly from your file manager, you'll need to use FTP to download to your machine, open a text editor, make the changes and upload the updated file. To make the changes, you need to open your sessions.php file in the includes directory, find the appropriate text and replace it as shown. I've done this kind of modification to my forums, and even though I could edit directly through my file manager, I found it simpler to drop the text into NotePad and use the find function to locate the areas that need to be changed. There can be a lot of information in these files, and finding the right place to make the changes can cause a headache.
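If you'd rather not hunt through the file by hand, the same find-and-replace can be scripted. This is only a sketch under my own assumptions: apply_mod() is a made-up helper, and the short string stands in for the real contents of includes/sessions.php. It refuses to continue when the FIND text isn't present, so a typo doesn't go unnoticed, and it reports how many places were changed.

```php
<?php
// Hypothetical helper: replaces every occurrence of $find with $replace
// and returns array(new source, number of replacements). Throws when
// the FIND text does not occur at all.
function apply_mod($source, $find, $replace)
{
    $count = 0;
    $result = str_replace($find, $replace, $source, $count);
    if ($count === 0) {
        throw new RuntimeException('FIND text not found; nothing was changed');
    }
    return array($result, $count);
}

// Stand-in for the contents of includes/sessions.php.
$source = "\$SID = 'sid=' . \$session_id;\n";

$find = "\$SID = 'sid=' . \$session_id;";
$replace = "if ( \$userdata['session_user_id'] != ANONYMOUS ){\n"
    . "    \$SID = 'sid=' . \$session_id;\n"
    . "} else {\n"
    . "    \$SID = '';\n"
    . "}";

list($patched, $count) = apply_mod($source, $find, $replace);
echo "Replaced $count occurrence(s):\n", $patched;
```

On a real board you would read the file with file_get_contents(), keep a copy of the original, and write the result back with file_put_contents().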
Alsenor, my instructions weren't written for a complete novice, but for users with some working knowledge of how HTML works. Editing a phpBB forum is pretty difficult! I edited the files using Dreamweaver and did a "Find and Replace". If you have any problems with phpBB, I'd recommend taking a look at www.phpbb.com/forums/ or www.phpbbhacks.com
Well, a complete web design novice I am not, but I have not done any php work yet. Have to find time to go through Kevin Yank's book first, which I've planned for a long time already. My forums are asp & access based, and I did a lot of mods in it: http://www.ggholiday.com/bg/FORUMS/default.asp My Game and Adult sites are mostly htm and asp pages: The Battle Group: www.ggholiday.com/bg/ Adult: www.erotical.list4.us/ (Sorry, I am not allowed to post live links yet) All the same, although editing files on my servers is no problem, I still have no idea what you are getting at - in principle. Session ID is something I don't know about. Indulge me, please!
php and mySql are installed, as per Kevin's tutorial. But I didn't have time yet to go into the juicier parts of the book.
You mean they should install automatically as per Kevin Yank's instructions? That may be taking it a step too far, since there are many variations of users.
The phpBB dev crew should either set the default install to be able to be spidered or create a flag/checkbox in the administration panel. If you have a forum that you don't want spidered, you should use the appropriate robots.txt file.
I am much more ignorant about this subject than you realize. What is phpBB? I suppose a bulletin board. I am only familiar with Snitz, which mine is based on. http://www.ggholiday.com/bg/FORUMS/default.asp
Ok, there's the problem. The instructions are meant for phpBB. If you're wanting to get other kinds of forums spidered, you'd have to find out how to remove session IDs for them. I don't know anything about the kind of forums you have installed, so I can't help you there. Now, for other kinds of php files, removing session ids is only something you need be concerned with if you're doing something that uses them. If not, don't worry about it.
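Since the question of what a session ID actually is came up: in phpBB it shows up as a sid=... parameter tacked onto the links, so every visit (including every spider visit) sees a different URL for the same page. As a rough illustration only, here is what stripping it from a link looks like. The helper strip_sid() and the example URL are made up, and the parameter name "sid" is an assumption that may differ on other forum software.

```php
<?php
// Hypothetical helper: removes a session-ID parameter (assumed to be
// named "sid") from the query string of a URL.
function strip_sid($url, $param = 'sid')
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['query'])) {
        return $url; // nothing to strip
    }

    // Split the query string into an array and drop the session ID.
    parse_str($parts['query'], $query);
    unset($query[$param]);

    // Rebuild the URL: everything before "?", remaining query, fragment.
    $result = substr($url, 0, strpos($url, '?'));
    if (count($query) > 0) {
        $result .= '?' . http_build_query($query);
    }
    if (isset($parts['fragment'])) {
        $result .= '#' . $parts['fragment'];
    }
    return $result;
}

echo strip_sid('http://www.example.com/forums/viewtopic.php?t=42&sid=0123456789abcdef'), "\n";
```

The link still points at the same topic; only the per-visitor part is gone.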
Snitz boards are a fine piece of work, but based on asp. They also have an excellent support group: http://forum.snitz.com/forum/default.asp Since I plan to work with php soon, I might as well find out now about phpBB - where can I get it? BTW, I think this board here is a fine design as well!