Solution to Getting your PHPBB Forum Spidered - Part 1


misohoni
Jul 4th 2004, 8:50 pm
I've been doing a bit of searching and testing, and found:

- A list of Bots to spider your site effectively.
- A way of stopping Session ID's
- Creating pages which can be easily spidered.



OK, let's do this step by step (I recommend backing up all files first):


CHECK IF YOUR SITE HAS SESSIONS

Go to http://www.tools.summitmedia.co.uk/spider/ to check how your site is spidered. You should see the session IDs there, and perhaps some unlinkable pages.


STOPPING SID'S (Guests won't be able to post unless registered, but can't find a better way of stopping SIDS)

#
#-----[ OPEN ]------------------------------------------
#
includes/sessions.php

#
#-----[ FIND ]------------------------------------------
#
$SID = 'sid=' . $session_id;

#
#-----[ REPLACE WITH ]------------------------------------------
#
if ( $userdata['session_user_id'] != ANONYMOUS ){
    $SID = 'sid=' . $session_id;
} else {
    $SID = '';
}
#---[EOM]----
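
To see why this change helps, here is a minimal sketch of how a session ID ends up on every link. The helper name `append_sid_sketch` and the sample session value are hypothetical; the function only approximates what the forum's own URL helper does, so treat it as an illustration rather than the actual phpBB code.

```php
<?php
// Hypothetical stand-in for the forum's URL helper: it tacks $sid onto a
// link only when $sid is non-empty and the link carries no sid already.
function append_sid_sketch(string $url, string $sid): string
{
    if ($sid === '' || strpos($url, 'sid=') !== false) {
        return $url; // guests (empty $sid) get the clean URL
    }
    // Use '?' for the first parameter, '&amp;' for any later one.
    $separator = (strpos($url, '?') !== false) ? '&amp;' : '?';
    return $url . $separator . $sid;
}

// Registered user: $SID = 'sid=' . $session_id (session value made up here).
echo append_sid_sketch('viewtopic.php?t=42', 'sid=abc123'), "\n";
// Guest or spider after the mod: $SID = ''.
echo append_sid_sketch('viewtopic.php?t=42', ''), "\n";
```

With an empty `$SID`, every guest and every spider sees the same plain URL for a topic, instead of a new URL for each visit.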


GETTING THE SITE SPIDERED

See the post above and add these spiders to the list in sessions.php:

//
// robots array all in lower case (feel free to add more robots)
//
$seRobots = array(
'almaden.ibm.com',
'appie 1.1',
'architext',
'ask jeeves',
'asterias2.0',
'augurfind',
'baiduspider',
'bannana_bot',
'bdcindexer',
'crawler',
'crawler@fast',
'docomo',
'fast-webcrawler',
'fluffy the spider',
'frooglebot',
'geobot',
'googlebot',
'gulliver',
'henrythemiragorobot',
'ia_archiver',
'infoseek',
'kit_fireball',
'lachesis',
'lycos_spider',
'mantraagent',
'mercator',
'moget/1.0',
'muscatferret',
'nationaldirectory-webspider',
'naverrobot',
'ncsa beta',
'netresearchserver',
'ng/1.0',
'osis-project',
'polybot',
'pompos',
'scooter',
'seventwentyfour',
'sidewinder',
'sleek spider',
'slurp/si',
'slurp@inktomi.com',
'steeler/1.3',
'szukacz',
't-h-you-n-d-e-r-s-t-o-n-e',
'teoma',
'turnitinbot',
'ultraseek',
'vagabondo',
'voilabot',
'w3c_validator',
'zao/0',
'zyborg/1.0',
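
As a rough sketch of how a list like this is typically used: the visitor's user-agent string is lower-cased and each entry is looked for as a substring. The function name `is_search_robot` and the short sample list are assumptions for illustration, not the code from the mod itself.

```php
<?php
// Hypothetical helper: true when any robot name occurs in the user agent.
function is_search_robot(string $userAgent, array $robots): bool
{
    $agent = strtolower($userAgent); // the robots list is all lower case
    foreach ($robots as $robot) {
        if ($robot !== '' && strpos($agent, $robot) !== false) {
            return true;
        }
    }
    return false;
}

// A few entries taken from the robots list.
$sampleRobots = array('googlebot', 'ia_archiver', 'teoma');

$agent = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
var_dump(is_search_robot($agent, $sampleRobots));
var_dump(is_search_robot('Mozilla/4.0 (compatible; MSIE 6.0)', $sampleRobots));
```

With substring matching, a broad entry such as 'crawler' will match any user agent that contains that word.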

misohoni
Jul 4th 2004, 8:52 pm
This is Part 2:

CREATE A ROBOTS.TXT FILE

This file stops spiders from accessing certain areas of your forum. Create a simple robots.txt file and save it in the site root. My forums were held in "forums/"; you should change this to your own directory name. The robots.txt file should contain:

User-agent: *
Disallow: /forums/admin/
Disallow: /forums/attach_mod/
Disallow: /forums/db/
Disallow: /forums/files/
Disallow: /forums/images/
Disallow: /forums/includes/
Disallow: /forums/language/
Disallow: /forums/templates/
Disallow: /forums/common.php
Disallow: /forums/config.php
Disallow: /forums/glance_config.php
Disallow: /forums/groupcp.php
Disallow: /forums/memberlist.php
Disallow: /forums/modcp.php
Disallow: /forums/posting.php
Disallow: /forums/printview.php
Disallow: /forums/privmsg.php
Disallow: /forums/profile.php
Disallow: /forums/ranks.php
Disallow: /forums/search.php
Disallow: /forums/statistics.php
Disallow: /forums/tellafriend.php
Disallow: /forums/viewonline.php
Disallow: /your-forum-folder/sutra*.html$
Disallow: /your-forum-folder/ptopic*.html$
Disallow: /your-forum-folder/ntopic*.html$
Disallow: /your-forum-folder/ftopic*asc*.html$
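
A simplified sketch of how a crawler might apply rules like these: under the basic robots.txt convention, each Disallow value is compared as a prefix of the requested URL path, which always starts with a slash. The function `is_disallowed` is hypothetical and ignores wildcards such as `*` and `$`, which only some crawlers support, so real spiders may behave differently.

```php
<?php
// Hypothetical prefix matcher: true when $path starts with any rule.
function is_disallowed(string $path, array $rules): bool
{
    foreach ($rules as $rule) {
        if ($rule !== '' && strncmp($path, $rule, strlen($rule)) === 0) {
            return true;
        }
    }
    return false;
}

$rules = array('/forums/admin/', '/forums/search.php');

var_dump(is_disallowed('/forums/admin/index.php', $rules)); // path under a blocked folder
var_dump(is_disallowed('/forums/viewtopic.php', $rules));   // path not covered by any rule
// Under this prefix rule, a value without the leading slash matches nothing:
var_dump(is_disallowed('/forums/admin/index.php', array('forums/admin/')));
```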

misohoni
Jul 4th 2004, 8:56 pm
I can't insert the next lines of code since this forum limits me (you don't get this in PHPBB, haha).

If you want the rest of the code, email me...

alsenor
Jul 4th 2004, 10:32 pm
It looks like something very useful, but I would need instructions on what to do with it!

misohoni
Jul 5th 2004, 6:30 am
hmm, I don't understand your question. My instructions tell you what to do with it! You need to open the relevant PHPBB files and edit them

Nitin M
Jul 5th 2004, 8:53 am
Here are some more links to SEO'ing phpbb:

http://www.computerbb.org/about580.html
http://www.able2know.com/forums/about15132.html

There are lots of different levels you can take it to. I think on my boards at seopark ( http://www.seopark.com/forums ) I have taken it as far as it can go...

1) limit front-end to only links and content that we want the spiders to follow when user is not logged in.

2) modified links to be dates on main forum page if user is not logged in.

3) make all links that are spiderable appear as static using isapirewrite

I played with just about all the mods listed above and then customized based on my own requirements. If you get stuck or need help, you can PM me.

alsenor
Jul 5th 2004, 2:06 pm
hmm, I don't understand your question. My instructions tell you what to do with it! You need to open the relevant PHPBB files and edit them

You know that it is hard for a person at any level of expertise to imagine that his directions are anything but totally clear. However, we do know better, don't we?
Attempting to go step by step, I went to summitmedia and entered my URL. Got a few pointers of value, and fixed them. Then I came back for the next step:
STOPPING SID'S (Guests won't be able to post unless registered, but can't find a better way of stopping SIDS)

#
#-----[ OPEN ]------------------------------------------
#
includes/sessions.php

and have no idea what to do with this! You see, to start with one has to understand the terms (stopping SIDS? what are SIDS - do I have those, and do I need to stop them, and where are they?).
At that point, as so often, the instruction session is over, because you speak a foreign language.
Sorry, but this is no criticism. I know exactly how it comes about, because I am doing it to people now and then. You know exactly when you lost them, by the glazed look in their eyes.
Like generation gaps, there are IT gaps, or whatever we want to call this.

debunked
Jul 5th 2004, 4:06 pm
SIDs - Session ID's

alsenor
Jul 5th 2004, 4:25 pm
Thanks - that sounds reasonable, but are the instructions clearer now?

stephfoster
Jul 5th 2004, 8:39 pm
It's step by step. If your host doesn't have a way for you to edit files directly from your file manager, you'll need to use FTP to download to your machine, open a text editor, make the changes and upload the updated file. To make the changes, you need to open your sessions.php file in the includes directory, find the appropriate text and replace it as shown.

I've done this kind of modification to my forums, and even though I could edit directly through my file manager, I found it simpler to drop the text into NotePad and use the find function to locate the areas that need to be changed. There can be a lot of information in these files, and finding the right place to make the changes can cause a headache.

misohoni
Jul 5th 2004, 8:55 pm
Alsenor, my instructions weren't written for a complete novice, but for users with some common knowledge of how HTML works. Editing a PHPBB forum is pretty difficult!

I edited the files using Dreamweaver and did a "Find and Replace".

If you have any problems with PHPBB, I'd recommend taking a look at www.phpbb.com/forums/ or www.phpbbhacks.com

alsenor
Jul 5th 2004, 9:56 pm
Well, a complete web design novice I am not, but I have not done any php work yet. Have to find time to go through Kevin Yank's book first, which I've planned for a long time already.

My forums are asp & access based, and I did a lot of mods in it: http://www.ggholiday.com/bg/FORUMS/default.asp

My Game and Adult sites are mostly htm and asp pages:
The Battle Group: www.ggholiday.com/bg/
Adult: www.erotical.list4.us/
(Sorry, I am not allowed to post live links yet)

All the same, although editing files on my servers is no problem, I still have no idea what you are getting at - in principle. Session ID is something I don't know about.
Indulge me, please!

misohoni
Jul 5th 2004, 11:02 pm
So you haven't even got PHPBB installed?

alsenor
Jul 6th 2004, 7:56 am
php and mySql are installed, as per Kevin's tutorial.
But I didn't have time yet to go into the juicier parts of the book.

Help Desk
Jul 6th 2004, 8:16 am
phpBB should really just add this to the default installation.

alsenor
Jul 6th 2004, 8:22 am
You mean they should install automatically as per Kevin Yank's instructions? That may be taking it a step too far, since there are many variations of users.

Help Desk
Jul 6th 2004, 9:37 am
You mean they should install automatically as per Kevin Yank's instructions? That may be taking it a step too far, since there are many variations of users.

The phpBB dev crew should either set the default install to be able to be spidered or create a flag/checkbox in the administration panel. If you have a forum that you don't want spidered, you should use the appropriate robots.txt file.

alsenor
Jul 6th 2004, 10:01 am
I am much more ignorant about this subject than you realize. What is phpBB?
I suppose a bulletin board. I am only familiar with Snitz, which mine is based on.
http://www.ggholiday.com/bg/FORUMS/default.asp

stephfoster
Jul 6th 2004, 10:54 am
Ok, there's the problem. The instructions are meant for phpBB. If you're wanting to get other kinds of forums spidered, you'd have to find out how to remove session IDs for them. I don't know anything about the kind of forums you have installed, so I can't help you there.

Now, for other kinds of php files, removing session ids is only something you need be concerned with if you're doing something that uses them. If not, don't worry about it.

alsenor
Jul 6th 2004, 11:13 am
Snitz boards are a fine piece of work, but based on asp.
They also have an excellent support group: http://forum.snitz.com/forum/default.asp

Since I plan to work with php soon, I might as well find out now about phpBB - where can I get it?

BTW, I think this board here is a fine design as well!

Arnica
Jul 6th 2004, 11:27 am
You can get phpBB from http://www.phpbb.com/

Help Desk
Jul 6th 2004, 12:07 pm
...BTW, I think this board here is a fine design as well!

vBulletin is better than phpBB. However, phpBB is free and a new better version is promised (with no timeline however). vBulletin costs $85 a year or $160 to keep the same version forever.

alsenor
Jul 6th 2004, 12:11 pm
Thanks, I see it is also open source. I will download it as soon as I can.
Also has a good community it seems.

alsenor
Jul 6th 2004, 1:06 pm
vBulletin is better than phpBB. However, phpBB is free and a new better version is promised (with no timeline however). vBulletin costs $85 a year or $160 to keep the same version forever.

Yes, and it may not be $160 bucks worth better.

alsenor
Jul 6th 2004, 2:55 pm
Now, to return to the initial topic of this post... ;)

Since backlinks apparently are of certain value to search engines it is desirable to have those links counted in BBs, right?

And somehow your topic was dealing with making certain that the BB would be crawled?
Are BBs and boards not usually crawled like other sites?

Arnica
Jul 6th 2004, 3:27 pm
Now, to return to the initial topic of this post... ;)

Since backlinks apparently are of certain value to search engines it is desirable to have those links counted in BBs, right?

Yes, all backlinks are good backlinks.

And somehow your topic was dealing with making certain that the BB would be crawled?
Are BBs and boards not usually crawled like other sites?

Yes and yes. Just like any other page created dynamically and using excessive query strings, e.g. '?ID=@@@34323244554333333333rrxyz&dat=6777979252&ind=blablabla', there is a danger that they will be ignored by search engines. The mods mentioned here make the phpBB post URLs spiderable.

Mick

alsenor
Jul 6th 2004, 4:16 pm
Ah, now I see what this is all about! Thanks for the clarification, Arnica.
Hard to imagine that our electronic monsters get weary of reading long addresses! :D
In any event, one can only hope that the admins of the boards we post in do those mods!

Lee Rees
May 3rd 2006, 9:25 am
I followed these instructions as suggested.

2 days later my forum totally disappeared from Google.

With no Google listing at all, I'm basically going to have to close my website down and start totally from scratch; that's 6 months of work promoting it down the pan. And before you ask, no, I don't backlink, clone etc.

I know that you're only trying to help, but please can you test your mods before you start giving them out. Some of us depend on phpBB as a source of income.

misohoni
May 3rd 2006, 10:39 am
Nothing to do with the help I offered. Of course this code was tested...

Sorry to hear your site was (permanently?) removed from Google, can't see how it's connected to the above coding though - esp in such a quick time frame.

I've advanced on from this tutorial as I found Phpbb restrictive, so now use SWF with extra mods for SEO

tech86
May 9th 2006, 2:40 pm
There is a phpBB mod that has been available for almost a year now that comes with clear concise instructions to do the above mentioned and more. Check the phpBB mod database.

misohoni
May 9th 2006, 6:18 pm
Yep, I guess there probably is, so I would recommend users check out the updated instructions. Please remember I first made this post back in 2004!

Sythe
May 11th 2006, 6:14 am
If you want your forum to get spidered (which you should unless you are running something illegal) then I recommend only following the parts of this guide which show you how to remove session IDs for guest users. Also you may wish to disallow search.php and memberlist.php in robots.txt if your server isn't particularly fast.

As for the guy above who claims he disappeared from Google and therefore has to start from scratch - rubbish. Delete your robots.txt and wait half a month. I bet you'll be back in Google with your original position.