Problems with JSESSIONIDs and robot crawlers when using Struts with Hibernate

Discussion in 'PHP' started by KBee, Aug 28, 2006.

  1. #1
    Hi,

    I'm having a problem with this. Any ideas would be great.

    Our application uses Tomcat and Struts with Hibernate to deliver dynamic web pages. The JSP pages use the c:url tag to automatically append the JSESSIONID to links if the client’s cookies are disabled. For SEO purposes, we’ve tried several approaches to handling JSESSIONIDs so that search engines do not incorrectly weight pages based on the transient JSESSIONID values. Ultimately, we’ve had to remove all the c:url tags so that the JSESSIONIDs are not generated, and add cloaking to our Apache webserver to strip the JSESSIONID from incoming links.

    However, this has turned out to be a nightmare. The problem is that a bot like Google can visit our website over a thousand times in an hour. Add this to other bots, and we’re in a situation where search bots account for a significant fraction of the total traffic to our website. That’s normally fine; however, when we remove the JSESSIONIDs from the links, every request from a robot crawler jumping from page to page on our website is interpreted as coming from a new unique user. Every new unique user is automatically assigned a session object by the servlet container. In addition, the use of Struts with Hibernate makes it a requirement to have a session object available (since Hibernate uses a unique session id to track persistent connections). Our system setup, combined with removing the JSESSIONIDs from all URLs on the website, overwhelms our servlet container because a new session object is spawned every single time a bot hits our website.

    The only way to handle this problem is to set the session-timeout value in the web.xml file to something very small (e.g. 1 minute), but this creates the additional problem of our users inadvertently being logged out after a minute of inactivity. There seems to be no solution to this problem. If I cater to the search bots, I screw the users, and vice versa. Any suggestions?
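
    For reference, the JSESSIONID stripping described above amounts to roughly the following. This is only a minimal Java sketch of the idea, not the actual Apache configuration; the class name `JsessionidStripper`, the method `strip`, and the example URL are hypothetical and used purely for illustration.

```java
import java.util.regex.Pattern;

public class JsessionidStripper {
    // Matches the ";jsessionid=..." path parameter appended by URL rewriting.
    // The value is assumed to end at the query string, fragment, or next parameter.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^?#;]*", Pattern.CASE_INSENSITIVE);

    /** Returns the URL with any jsessionid path parameter removed. */
    public static String strip(String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }

    public static void main(String[] args) {
        // Hypothetical link as a crawler might have indexed it.
        System.out.println(strip("/products/list.do;jsessionid=0A1B2C3D?page=2"));
        // A link without a session id passes through unchanged.
        System.out.println(strip("/products/list.do?page=2"));
    }
}
```

    Once the id is stripped like this, a request that arrives without a session cookie carries nothing that ties it to an earlier request, which is why each bot hit ends up with a fresh session.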
     
    KBee, Aug 28, 2006