I want Google (and possibly other search engines) to be able to index my password-protected, rich-content site area, and to be more specific the course area of my under-construction academy, http://academy.webnauts.net. Could that be a reason to be penalised? And can you recommend any legitimate techniques, if any are available?
All the links on your site map page are giving me 404 Not Found errors, so you might want to take a look at that.

If you are asking whether there is a way to get content behind the login indexed, then the answer (as far as I know) is no. It is not a punishment: the spider can't get in there to see the content because it doesn't have a username or password. If it did get in and index it, then the people clicking on the search engine result couldn't get in, because they don't have usernames and passwords.

You could make more of your content public and then that would get indexed, but presumably there is a reason why you have the login, so that may not be an option. Alternatively, I have seen search engines indexing site search results, which provide lists of documents matching given terms along with an extract of each. So if you had pages that linked to extracts of information, people might find those and would have to join before they could read the full versions.

Incidentally, there are a few other links that don't seem to work, including the login from the homepage. I would look at that first if I were you.

Alastair
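A minimal sketch of the "public extract, full text behind login" idea described above, written as a hypothetical Flask app (the route names, the ARTICLES data and the login handling are all assumptions for illustration, not anything taken from the actual site). The extract pages are crawlable and indexable, while the full lessons still require a session:

```python
from flask import Flask, redirect, session, url_for

app = Flask(__name__)
app.secret_key = "replace-with-a-real-secret"  # needed for sessions

# Hypothetical course content, standing in for the real academy pages.
ARTICLES = {
    "html-basics": {
        "title": "HTML Basics",
        "excerpt": "The first couple of paragraphs of the lesson...",
        "body": "The full lesson text, visible to members only.",
    },
}

@app.route("/extract/<slug>")
def extract(slug):
    """Public teaser page: crawlable and indexable by search engines."""
    a = ARTICLES[slug]
    return (f"<h1>{a['title']}</h1><p>{a['excerpt']}</p>"
            f"<a href='{url_for('login')}'>Join to read the full lesson</a>")

@app.route("/course/<slug>")
def full_lesson(slug):
    """Members-only page: anyone without a session is sent to the login."""
    if not session.get("user"):
        return redirect(url_for("login"))
    a = ARTICLES[slug]
    return f"<h1>{a['title']}</h1><p>{a['body']}</p>"

@app.route("/login")
def login():
    return "Login form goes here"
```

Nothing behind /course/ ever reaches the index this way, but the extract pages give the engines something to rank and give searchers something they can actually click through to.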
As I said above, the site is under construction, so not all of the links work and there are other outstanding issues. I am sure that password-protected sites can be indexed, for example those using the CMS Moodle. Thanks though for your kind response.
Some sites allow search engines to crawl the content but then only allow users to see a snippet, thus gaining signups. This is allowed for some sites (like I say), but I would always email Google first...
If this is possible I'd be really interested in knowing how to do it, because I have never seen it. The only results I have come across from WebmasterWorld in Google are forum topics, and the content that was indexed is publicly available. I've never seen it in results from any other site either. I have seen sites get their internal site search results (including article snippets) indexed, after which you need to log in to view the rest. That seems perfectly 'legal' and is an option.

I suppose you could use scripting on the page URL to control who can view the page by detecting the user agent: show the bot the page content and redirect live users to a login before they can see it. That would be cloaking, though, and it would also make the content insecure. Five minutes of coding and you could pretend to be Googlebot and suck down all the content you wanted, no?

I can't see any other way of doing it: either cloaking, which is probably not a good idea, or providing a URL that actually accesses the content (in which case it would bypass the login and be publicly available).

I don't understand the Moodle point, though. Do you mean that a Moodle site has its protected content indexed, or that you can use their system to get your content indexed?

P.S. Sorry I missed the under-construction bit. Thought I'd better let you know about the links in case you hadn't spotted it.

Alastair
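For what it's worth, here is a minimal sketch of the user-agent cloaking approach described above, purely to illustrate how weak it is; serve_page is a made-up helper, and the whole decision rests on a header that any client can set:

```python
def serve_page(user_agent: str, full_text: str, login_url: str) -> str:
    """Naive user-agent cloaking: give the bot the content, bounce humans to login."""
    if "Googlebot" in user_agent:
        return full_text                  # the spider sees the protected content
    # Everyone else gets redirected to the login page.
    return f"<meta http-equiv='refresh' content='0;url={login_url}'>"

# Defeating it takes one spoofed header, e.g. with the requests library:
#   requests.get(page_url, headers={"User-Agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"})
```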
Are you a member there? Try logging out. And it wouldn't be insecure if you were to use IP-based cloaking, although you'd need a regularly updated list of Google IPs.
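Rather than maintaining a list of Google IPs by hand, one commonly suggested alternative is to verify the visiting crawler with a reverse DNS lookup followed by a forward confirmation. A sketch, assuming the crawler's address reverse-resolves as Google's published bot hostnames do:

```python
import socket

def is_verified_googlebot(ip: str) -> bool:
    """Reverse-resolve the IP, check the hostname, then forward-confirm it.

    This avoids both the spoofable User-Agent header and a hand-maintained
    IP list, at the cost of a couple of DNS lookups per check.
    """
    try:
        hostname = socket.gethostbyaddr(ip)[0]               # reverse DNS
        if not hostname.endswith((".googlebot.com", ".google.com")):
            return False
        forward_ips = socket.gethostbyname_ex(hostname)[2]   # forward confirm
        return ip in forward_ips
    except (socket.herror, socket.gaierror):
        return False
```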
Well spotted. I logged out and I still get the same results. When I do a site search on WebmasterWorld from Google, everything I see is forum posts, and when I click on a link I can view the page as a guest with no problem. The IP approach would work all right, but I'd imagine Google would come down on you like a ton of bricks if they caught you.
Not if you OK it with them first, as I imagine Brett would have done. I don't think all the threads are "cloaked" like that... I don't really have enough time to check, though.
Sounds like it might be worth asking Google, then. There is an article here which you might find interesting if you want protected content indexed, and in it Danny Sullivan seems to confirm that Google have entered limited agreements with certain sites on various ways to index their protected content. http://www.betanews.com/article/Google_Indexing_Subscription_Content/1120164520
Google do allow cloaking in some exceptional cases (like the one you have mentioned). For example, if a music video site wants to be found in searches for lyrics, it might cloak the video page and show the song's lyrics to the search engine.
Yeah, they provide different content based on your IP themselves, and so do many other sites for legitimate reasons. I've heard people from Google say that they try to look at the intent behind the practice. I suppose the two questions would be: are they able to do that effectively, and would they like the intent in this case? Probably Google are the only folks who can answer that, so I certainly wouldn't risk it without asking them.

As a user of Google I would be very unimpressed if the SERPs were showing pages that I couldn't click through to. The article above seems to indicate that the new system they plan would deal with this either by making the first view free or by separating the results. Google would, I imagine, have a legitimate interest in indexing the hidden web, but I'm not sure they would be keen on that content being mixed with their main results. Who knows.

Alastair
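As a rough illustration of the "first view free" idea the article mentions (this is only a sketch of one plausible mechanism, keyed off the Referer header; the article doesn't spell out how Google's arrangements with publishers actually work), a site could let a visitor who clicks through from a Google results page read the article once and ask everyone else to log in:

```python
from urllib.parse import urlparse

def allow_free_view(referer: str, logged_in: bool) -> bool:
    """Grant access to logged-in members, or to visitors arriving
    directly from a Google results page (the 'first view free')."""
    if logged_in:
        return True
    host = urlparse(referer or "").netloc.lower()
    return host == "www.google.com" or host.endswith(".google.com")

# A click-through from a results page gets the full article; a direct visit doesn't.
print(allow_free_view("http://www.google.com/search?q=academy+course", logged_in=False))  # True
print(allow_free_view("", logged_in=False))                                               # False
```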