PHP Generated HTML Pages, question!

red-sky Peon

#1

I have just figured out how to build my Amazon associate website (don't worry, it's a PHP question) and came across this problem.

Suppose I use PHP to generate every product HTML page on the fly, based on data parsed for the user, and use mod_rewrite to give those pages .html addresses. (Is that a good way of doing it?) I assume that when Google crawls my website it will trigger the same PHP code and land on the same HTML pages a user would. But if Google indexes these pages, can they then be reached by following the mod_rewrite .html address? For example:

www.mynewawssite.com/get-the-product-the-user-requests.php

Then mod_rewrite to rename the destination page:

www.mynewawssite.com/Product-Title.html

When Google crawls this page ^^^, will they index it? And if so, will a Google user be able to reach it by clicking the link to www.mynewawssite.com/Product-Title.html?

If not, is there a PHP function I need in order to generate and save the page on my server? And if I do that, do I need to stop it from generating a page when one already exists?

Thanks in advance for any advice given!

red-sky, Nov 20, 2008 IP

jestep Prominent Member

#2

Google indexes PHP pages just the same as HTML pages.

PHP is executed on the server before any data is sent to a web browser (or to Google's crawler). Unless your URL looks like a huge query string (?this=nvfruiovfrnuivngre&that=jfunwevbo8t4wivtoui&theother=jnviurbvyuiotrhnvitoe) there's no reason to use mod_rewrite just to change the file extension.

Personally, I would just leave the pages with .php and make sure they are named something relevant to the page itself, e.g. mysite.com/computer-hardware.php

jestep, Nov 20, 2008 IP

joebert Well-Known Member

#3

What I do is use a html_cache folder and a RewriteRule/RewriteCond to see if the REQUEST_URI exists in that cache folder. If it exists I rewrite the request to that HTML file transparently. If it doesn't exist I rewrite the uri to my PHP page which generates an HTML file and saves it to that cache folder for the next person before returning the generated page.
The browser/search-engine/etc never sees a *.php URL.
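
A minimal sketch of the PHP side of such a setup, assuming the rewrite rules already send cache misses to a PHP script. The function name serve_cached_page, the slug format and the $render callback are hypothetical and not taken from the post; the RewriteCond/RewriteRule lines themselves are not shown.

```php
<?php
// Hypothetical helper: none of these names come from the post above.

/**
 * Return the HTML for $slug, writing it to the cache folder on a miss.
 * $render is any callable that builds the page (e.g. from an API call).
 */
function serve_cached_page(string $cacheDir, string $slug, callable $render): string
{
    // Allow only simple slugs so a request cannot escape the cache folder.
    if (!preg_match('/^[A-Za-z0-9_-]+$/', $slug)) {
        throw new InvalidArgumentException('Invalid slug');
    }

    $path = $cacheDir . '/' . $slug . '.html';

    // Cache hit: the rewrite rule would normally serve this file directly,
    // but check here too in case PHP is reached anyway.
    if (is_file($path)) {
        return (string) file_get_contents($path);
    }

    // Cache miss: build the page, then save it for the next visitor.
    $html = $render($slug);
    if (!is_dir($cacheDir)) {
        mkdir($cacheDir, 0755, true);
    }

    // Write to a temp file and rename it, so nobody reads a half-written page.
    $tmp = $path . '.' . uniqid('', true) . '.tmp';
    file_put_contents($tmp, $html);
    rename($tmp, $path);

    return $html;
}
```

Calling serve_cached_page('html_cache', 'Product-Title', $render) twice should run $render only once; the second call reads html_cache/Product-Title.html. Deleting a file from the cache folder simply causes the next request to regenerate it.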

Here's an example of it in action. I just started on the store links this morning, but I've been using the technique for the wallpaper pages for weeks now.

// Edit -- Here's a debugging thread of mine about the RewriteCond when I first implemented it.

joebert, Nov 20, 2008 IP

red-sky Peon

#4

jestep said: ↑

Google indexes PHP pages just the same as HTML pages.

PHP is executed on the server before any data is sent to a web browser (or to Google's crawler). Unless your URL looks like a huge query string (?this=nvfruiovfrnuivngre&that=jfunwevbo8t4wivtoui&theother=jnviurbvyuiotrhnvitoe) there's no reason to use mod_rewrite just to change the file extension.

Personally, I would just leave the pages with .php and make sure they are named something relevant to the page itself, e.g. mysite.com/computer-hardware.php

But say, for example, I never create a page (file) in PHP or HTML at all. Instead, when a user wants to see an item's page, I use PHP to call Amazon's API, get the data needed there and then, and apply it to a set template. The user sees an HTML page generated from the results of the API call. Will these pages be indexed by Google? If so, does PHP create a file, and how do you prevent it from creating a new file each time? I know it's hard to understand what I'm talking about, but it's really difficult to explain.

red-sky, Nov 20, 2008 IP

jestep Prominent Member

#5

Yes, they will be indexed by Google, but how many pages Google indexes depends a lot on the number of links and the trust your site has. Not even static pages will fix this. Since a completely dynamic site can potentially have an infinite number of pages, Google gets selective about how many it indexes. Make sure you have a relevant hierarchy of navigation, such as:
index:
categories:
sub_categories:
products:

Also, make sure you aren't using session IDs in your URLs. As stated above, try to use a relevant URL scheme as well. Using the product hierarchy above, something like this will really help:

mysite.com/category/sub_cat/product-name.php
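
One way to build URLs of that shape from product data, as a rough sketch. The helper names slugify and product_url, the hyphen separator and the sample product are all assumptions made for illustration.

```php
<?php
// Hypothetical helpers for building readable URLs.

function slugify(string $text): string
{
    // Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    // ASCII only: accented characters are treated as separators here.
    $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($text));
    return trim($slug, '-');
}

function product_url(string $category, string $subCategory, string $productName): string
{
    return '/' . slugify($category)
         . '/' . slugify($subCategory)
         . '/' . slugify($productName) . '.php';
}

echo product_url('Computer Hardware', 'Hard Drives', 'Acme 500GB SATA Drive'), "\n";
// prints "/computer-hardware/hard-drives/acme-500gb-sata-drive.php"
```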

jestep, Nov 20, 2008 IP

joebert Well-Known Member

#6

People never understand my answer to this question. It's a shame too because it does exactly what they're fumbling around trying to explain and it does it well.

joebert, Nov 20, 2008 IP

eric90 Peon

#7

Google will index it just fine, because to Google and to users there's really no difference: mod_rewrite runs before the requested file is even loaded.

WordPress is one example of a site that uses mod_rewrite for this purpose, and as far as I know Google indexes WordPress pages just fine, even though they are not really physical pages.

eric90, Nov 21, 2008 IP

red-sky Peon

#8

eric90 said: ↑

WordPress is one example of a site that uses mod_rewrite for this purpose, and as far as I know Google indexes WordPress pages just fine, even though they are not really physical pages.

This is exactly what I was talking about: my product pages will not be physical pages. At the minute I'm not worried about whether Google will index them. But if it does index a non-physical HTML page, how will that hyperlink lead to the product when there's no physical product page? Will I be required to generate a physical page for each product I want indexed?

red-sky, Nov 21, 2008 IP

red-sky Peon

#9

joebert said: ↑

What I do is use a html_cache folder and a RewriteRule/RewriteCond to see if the REQUEST_URI exists in that cache folder. If it exists I rewrite the request to that HTML file transparently. If it doesn't exist I rewrite the uri to my PHP page which generates an HTML file and saves it to that cache folder for the next person before returning the generated page.
The browser/search-engine/etc never sees a *.php URL.

So do you mean you check if this webpage exists in the html_cache folder before loading anything, then if it does, load it from the html_cache folder, or if not, then generate the webpage and store it in the html_cache folder?

Does this html_cache folder hold all of the generated pages for a period of time? If so, what happens when a user finds your indexed HTML page on Google (not the PHP page that checks whether the cached page exists), clicks through to it, and it no longer exists in html_cache?

red-sky, Nov 21, 2008 IP

eric90 Peon

#10

red-sky said: ↑

This is exactly what I was talking about: my product pages will not be physical pages. At the minute I'm not worried about whether Google will index them. But if it does index a non-physical HTML page, how will that hyperlink lead to the product when there's no physical product page? Will I be required to generate a physical page for each product I want indexed?

Yes, you do. What happens is mod_rewrite will change it into a format such as http://www.example.com/view/post/123

So you must make your index page the "parser". It first has to detect whether the URL is in that specific format. If not, do something: show a default page, output an error, or send the visitor to the main page so that every URL from there onwards is formatted correctly.

If the URL is in the correct format, you can read it from PHP's $_SERVER array: $_SERVER['REQUEST_URI'] can be trimmed down to just the part after the base URL. Then use explode() with the delimiter set to '/', and you end up with an array of "commands", which in my example above would be {"view", "post", "123"}.

At this point your index page reads the first array value, sees that it's a "view" command for a "post", and that the post ID is 123. Then you just pull that record from your database.

And you're done.
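
The steps above can be sketched roughly as follows. This assumes mod_rewrite already sends every such request to index.php; the function names parse_route and dispatch are made up for the example, and the database lookup is left as a comment.

```php
<?php
// Hypothetical sketch of the "parser" idea; names are illustrative.

/**
 * Split a request URI such as "/view/post/123?x=1" into its path segments.
 */
function parse_route(string $requestUri): array
{
    // Drop the query string, keeping only the path part.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return [];
    }
    // Trim the outer slashes, then split on "/".
    $trimmed = trim($path, '/');
    return $trimmed === '' ? [] : explode('/', $trimmed);
}

/**
 * Decide what to do with the segments; returns a description for the demo.
 */
function dispatch(array $segments): string
{
    if (count($segments) === 3 && $segments[0] === 'view' && ctype_digit($segments[2])) {
        // A real site would look up the record in the database here.
        return "view {$segments[1]} #{$segments[2]}";
    }
    // Unknown format: fall back to a default page or a 404.
    return 'not found';
}

// In index.php this would be: dispatch(parse_route($_SERVER['REQUEST_URI']));
echo dispatch(parse_route('/view/post/123')), "\n"; // prints "view post #123"
```

Checking that the ID is numeric before touching the database keeps malformed URLs on the "not found" path.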

eric90, Nov 21, 2008 IP
