Hi, I really need some help here. I have 2 websites. My first site, www.travelersmeeting.com, is located on my server as my main site under the public_html folder. My second website, www.thenewagetraveler.com, is in a subdirectory: travelersmeeting.com/public_html/www.thenewagetraveler.com. I don't know if that is the correct setup, but unfortunately that is how it started.

As a result, when Google crawls my first site, www.travelersmeeting.com, it also indexes pages from my second site. Example:

http://www.thenewagetraveler.com/China-travel-movie.html
http://www.travelersmeeting.com/thenewagetraveler/China-travel-movie.html

If I make a robots.txt file with Disallow: /thenewagetraveler/, will it work? Or will it stop www.thenewagetraveler.com from being indexed at all? Please help, any ideas are welcome.
Yes, it will work. Note that the file must be named robots.txt (not robot.txt). Crawlers request robots.txt separately for each hostname, always from the root of that hostname.

If you create a robots.txt that disallows /thenewagetraveler/ and place it in public_html, crawlers visiting your first site (whose root is public_html) will stop crawling the files of your second website through that folder path.

Crawlers visiting your second site reach it through its own domain, so they only see the contents of that site's folder. They never see the robots.txt in public_html, so they will keep indexing your second site as they should.

In simple words: create a robots.txt that disallows /thenewagetraveler/, place it in public_html and nowhere else, and it will do what you want.
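
If you want to sanity-check the rule before uploading it, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content and the helper name allowed are my own assumptions for illustration. The empty rule set for the second domain assumes that no robots.txt exists in that site's folder. This only simulates how a well-behaved crawler reads the rules; it does not contact your server.

```python
from urllib.robotparser import RobotFileParser

# Assumed content of public_html/robots.txt (served only for www.travelersmeeting.com).
ROBOTS_TXT = """\
User-agent: *
Disallow: /thenewagetraveler/
"""

def allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Hypothetical helper: True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Duplicate copy reached through the first domain: should be blocked.
main_site_copy = "http://www.travelersmeeting.com/thenewagetraveler/China-travel-movie.html"
# Ordinary page on the first domain: should stay crawlable.
main_site_home = "http://www.travelersmeeting.com/"
# Same page reached through the second domain. That host has its own root,
# so the rules above do not apply; we assume it has no robots.txt at all.
second_site_page = "http://www.thenewagetraveler.com/China-travel-movie.html"

copy_is_allowed = allowed(ROBOTS_TXT, main_site_copy)
home_is_allowed = allowed(ROBOTS_TXT, main_site_home)
second_is_allowed = allowed("", second_site_page)

print("duplicate via first domain allowed:", copy_is_allowed)
print("first domain home page allowed:", home_is_allowed)
print("page via second domain allowed:", second_is_allowed)
```

The key design point is that the rules are evaluated per hostname: the Disallow line is only ever seen by crawlers asking www.travelersmeeting.com for its robots.txt.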