Does anyone know if Google will index my website if the text on each page is about 500 KB? The texts are unique and not already indexed by Google. Maybe someone knows if there is some sort of official limit?
Google usually indexes the first few hundred KB of a page (I read somewhere it's 200-300 KB but can't remember the source). That's why it's important for a content-based site not to put too many images, Flash animations, or anything else that increases the page size ahead of the text content. If your site is large and you worry that not all of it will get indexed, create a sitemap. If you have images on the site, distribute them evenly throughout the site so that Google has a chance to index as much text as possible.
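For anyone who hasn't made a sitemap before, here is a minimal sketch in Python of what the advice above amounts to. The URLs and the `build_sitemap` helper are made up for illustration; the only fixed part is the sitemaps.org namespace. A sitemap helps pages get discovered, and as far as I know it says nothing about how much of each page gets read.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs; replace with the real pages of the site.
sitemap_xml = build_sitemap([
    "https://example.com/",
    "https://example.com/articles/long-page.html",
])
print(sitemap_xml)
```

Save the output as sitemap.xml in the site root and submit it through the search engine's webmaster tools.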
I don't believe there is a limit. I've seen some "large" pages indexed by Google, mostly stuff from college students. I'm sure some factors come into play in how much of a web page gets indexed, but what those factors are I don't know.
As ssandecki said, Google indexes very large pages. The size was limited back in the old days, but now it doesn't matter. I've had a 780 KB page, and pasting the very last sentence on the page into Google resulted in the page being found, indicating it crawled every last character. Unusually large page sizes start to become a usability issue before they become a Googlebot issue.
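If you want to run the same last-sentence test on your own page, it helps to know how deep into the file that sentence sits. A rough sketch, assuming you already have the page source as a string; `byte_offset_of` and the sample page are made up for illustration:

```python
def byte_offset_of(html, phrase, encoding="utf-8"):
    """Return (offset, total_size) in bytes; offset is -1 if phrase is absent."""
    data = html.encode(encoding)
    return data.find(phrase.encode(encoding)), len(data)


# Hypothetical page: a lot of filler text followed by one distinctive sentence.
last_sentence = "This is the very last sentence on the page."
sample_page = "<html><body><p>" + "filler text " * 50000 + last_sentence + "</p></body></html>"

offset, total = byte_offset_of(sample_page, last_sentence)
print(f"page size: {total / 1024:.0f} KB, last sentence starts at {offset / 1024:.0f} KB")
```

Searching Google for that sentence in quotes then tells you whether the crawler got that far into the file.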
It's not so much a question of how much they "can" crawl as how much they "will" crawl. Usually they limit the crawl of a document. I can't confirm it, but I remember reading something about this a while ago, and I think they were discussing numbers between 150 and 250 KB, at which point the spiders stop and leave the page. If your site is large and you worry that not all of it will get indexed, create a sitemap. If you have images on the site, distribute them evenly throughout the site so that Google has a chance to index as much text as possible.
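To get a feel for what such a cutoff would mean for a given page, here is a rough Python sketch that estimates how much of the visible text falls inside the first N bytes of the HTML. The cutoff value is purely an assumption taken from the unconfirmed numbers in this thread, not a documented Googlebot limit, and `text_fraction_within` and the sample page are made up for illustration.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects text outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def visible_text_length(html):
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return len("".join(parser.chunks).strip())


def text_fraction_within(html, cutoff_bytes):
    """Fraction of visible text found in the first cutoff_bytes of the page."""
    total = visible_text_length(html)
    head = html.encode("utf-8")[:cutoff_bytes].decode("utf-8", errors="ignore")
    return visible_text_length(head) / total if total else 0.0


# Hypothetical page: a block of CSS ahead of the actual text.
page = ("<html><head><style>" + "a{}" * 1000 + "</style></head><body><p>"
        + "word " * 2000 + "</p></body></html>")

ASSUMED_CUTOFF = 8000  # bytes; arbitrary value chosen for this small sample
fraction = text_fraction_within(page, ASSUMED_CUTOFF)
print(f"{fraction:.0%} of the text is inside the first {ASSUMED_CUTOFF} bytes")
```

Moving markup, scripts, and styles out of the top of the page raises that fraction, which is the point made earlier about not putting heavy content ahead of the text.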