Hi, I am using the Snoopy class to read the contents of a web page, but it doesn't perfectly simulate a browser: the page content I get when fetching with Snoopy is different from what I see when I view the page directly in a browser. I even set the user agent string in Snoopy, but still no luck. How can we simulate a browser perfectly, so that the site cannot tell whether a script or a real browser is requesting the page? Thanks
What do you mean by "different content"? Does it show nothing, or a warning advising you to stop using a bot? One possible explanation is that Snoopy can't execute the JavaScript code on the site.
I believe Snoopy's dependencies are cURL and/or fsockopen(). cURL is PHP's binding to the libcurl HTTP client library rather than a real browser, but it is probably the closest you can get to simulating a browser in pure PHP. You can take advantage of the options cURL accepts, such as the referer, the user agent, and a proxy (to change the IP address), to make the request look more realistic to the remote site.
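For illustration, here is a minimal sketch of the cURL options mentioned above. The function names `browser_like_options` and `fetch_like_browser` are made up for this example, and the user agent, referer, and proxy values are placeholders you would choose yourself. It assumes PHP 7.1 or later with the cURL extension enabled, and it won't make a script truly indistinguishable from a browser.

```php
<?php
// Hypothetical helper: collect the cURL options that make a request
// look more like one sent by a browser.
function browser_like_options(string $userAgent, string $referer, ?string $proxy = null): array
{
    $options = [
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow redirects, as a browser would
        CURLOPT_USERAGENT      => $userAgent, // sent as the User-Agent header
        CURLOPT_REFERER        => $referer,   // sent as the Referer header
    ];
    if ($proxy !== null) {
        // Route the request through a proxy so the site sees the proxy's IP.
        $options[CURLOPT_PROXY] = $proxy;
    }
    return $options;
}

// Hypothetical helper: fetch $url with the given options.
// Returns the response body as a string, or false on failure.
function fetch_like_browser(string $url, array $options)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, $options);
    // The handle is released when $ch goes out of scope.
    return curl_exec($ch);
}
```

Usage would look something like `$html = fetch_like_browser('http://example.com/', browser_like_options('Mozilla/5.0 (compatible; ExampleBrowser/1.0)', 'http://example.com/'));`, with the proxy passed as a third argument to `browser_like_options` if you need one.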
Got it. It's happening because the site tries to set cookies, and since Snoopy can't accept them (or maybe it can and I don't know how!?), the site shows different content. I tested this by disabling cookies in my browser and then checking the site: its content was now the same as what Snoopy shows. So how do I allow sites to set cookies using Snoopy?
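One approach that may work, as far as I know: Snoopy has a `$cookies` array that is sent with each request, and a `setcookies()` method that fills it from the `Set-Cookie` headers of the last response. So you can fetch the page once, call `setcookies()`, and fetch again. The sketch below assumes that behaviour, so please check it against your Snoopy version. The helper name `fetch_with_cookies` is made up for this example, and `$snoopy` is assumed to be a Snoopy instance you created after including Snoopy.class.php.

```php
<?php
// Hypothetical helper: request the page twice so that cookies set by the
// first response are sent back with the second request.
// Assumes $snoopy is a Snoopy instance (Snoopy.class.php already included).
function fetch_with_cookies($snoopy, string $url)
{
    // First request: the site answers with its Set-Cookie headers.
    if (!$snoopy->fetch($url)) {
        return false;
    }

    // Copy the received cookies into $snoopy->cookies for the next request.
    $snoopy->setcookies();

    // Second request: now carries the cookies, like a returning browser.
    if (!$snoopy->fetch($url)) {
        return false;
    }

    return $snoopy->results;
}
```

If you go with plain cURL instead, the `CURLOPT_COOKIEJAR` and `CURLOPT_COOKIEFILE` options do the same job by saving cookies to a file and sending them back on later requests.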