Shawn, GuyFromChicago, and whoever else may be reading... I do not have extensive knowledge of PHP, yet ;-) I do need to get a couple of tools working, and I don't really have the darndest idea how to start. My goals: I am creating a website that will become a portal for wrestling-related information... Articles, News, Shopping (obviously plugging JRWrestling.com as the Froogle, if you will), and a directory of HS, College, and other related links. My problem is this: I want to draw news from 2 - 3 different sources (all of which will give me permission) automatically, without having to touch code. There is a Google News script (that doesn't seem to work anymore for them since they sued the maker). If you want to see that for the core idea, please let me know.
Well, the easiest way to do it would be if they gave you some sort of standard feed (like RSS). There are lots of RSS feed scripts around (check hotscripts.com, for example). - Shawn
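To give a rough idea of what an RSS feed script does, here is a minimal sketch in Python (the thread mentions PHP, but the idea carries over). It assumes the source site publishes a plain RSS 2.0 feed. The sample feed, its example.com URLs, and the `parse_rss` name are all made up for illustration; a real script would download the feed text first.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed (assumption); a real one would be
# downloaded from the news site that publishes it.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Wrestling News</title>
    <item>
      <title>State finals recap</title>
      <link>http://example.com/news/1</link>
    </item>
    <item>
      <title>College rankings released</title>
      <link>http://example.com/news/2</link>
    </item>
  </channel>
</rss>"""


def parse_rss(xml_text):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    root = ET.fromstring(xml_text)
    headlines = []
    for item in root.iter("item"):
        # findtext returns None when the element is missing.
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        headlines.append((title, link))
    return headlines
```

Calling `parse_rss(SAMPLE_FEED)` gives the headline and link pairs, which can then be written into the portal page. Because every RSS feed has the same layout, the same function should work for any site that offers one.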
Well, the problem there is... these guys run big sites, but don't have technical knowledge. I will need to pull it directly off of their sites. The extension is .asp... I wonder if editing a link harvester would work. Google doesn't offer RSS, I don't believe, but the code still seems to pull the news from the pages somehow.
They probably are screen-scraping... Which means you would need to write a custom script for each site. - Shawn
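For anyone wondering why screen-scraping needs a custom script per site, here is a minimal sketch in Python. It pulls headline links out of raw HTML by looking for a link pattern that is specific to one site. The sample page, the `story.asp` marker, the example.com base URL, and the names `LinkScraper` and `scrape_headlines` are all assumptions for illustration; each real site would need its own marker worked out by reading its HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical page markup (assumption); every real site lays out
# its news list differently.
SAMPLE_PAGE = """<html><body>
<a href="/about.asp">About us</a>
<table>
<tr><td><a href="story.asp?id=101">Team wins <b>state</b> title</a></td></tr>
<tr><td><a href="story.asp?id=102">Rankings updated</a></td></tr>
</table>
</body></html>"""


class LinkScraper(HTMLParser):
    """Collect (text, absolute URL) for links whose href contains a marker."""

    def __init__(self, base_url, href_marker):
        super().__init__()
        self.base_url = base_url
        self.href_marker = href_marker  # the site-specific part
        self.headlines = []
        self._href = None  # set while inside a matching <a> tag
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if self.href_marker in href:
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse whitespace left over from nested tags and line breaks.
            title = " ".join("".join(self._text).split())
            self.headlines.append((title, self._href))
            self._href = None


def scrape_headlines(page_html, base_url, href_marker):
    parser = LinkScraper(base_url, href_marker)
    parser.feed(page_html)
    parser.close()
    return parser.headlines
```

With the sample page, `scrape_headlines(SAMPLE_PAGE, "http://example.com/news/list.asp", "story.asp")` returns the two story links and skips the "About us" link. If the site changes its markup or link format, the marker stops matching and the script has to be adjusted, which is the maintenance cost of this approach.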
OK, OK... I'd be willing to pay for the creation of this, as well as a script that can create an Outlook file from osCommerce's customer list... Or you could create it and offer the code to the public for free?
Sorry... not something I want to do. It's useless to the public because it would only work for the specific site. If you can't do it yourself, your best option would be to hire a programmer to do it for you. - Shawn
What are you trying to scrape? I could write something in VBScript for .asp that grabs stuff. For a sample of such a script, see http://www.sfreader.com/scraper.asp (enter a URL into the field; be sure to include http://, for example http://www.yahoo.com). The result will show you the HTML source for the page and the page itself. Basics: it goes out, grabs the client-side source, loads it into a variable, and serves it up. The thing is that the content is now being served from YOUR server. Any spider hitting this page would see it as YOUR page, not the page you are scraping from. TampaDave
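The "goes out, grabs the source, loads it into a variable, serves it up" steps could look roughly like this in Python. This is only a sketch of the general idea, not the VBScript behind the sample page; the function names `fetch_source` and `render_result` and the output layout are assumptions.

```python
import html
from urllib.request import urlopen


def fetch_source(url, timeout=10):
    """Go out, grab the client-side source, and load it into a variable."""
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def render_result(url, source):
    """Build a page showing the escaped HTML source, then the page itself."""
    return (
        f"<h1>Source of {html.escape(url)}</h1>\n"
        f"<pre>{html.escape(source)}</pre>\n"
        f"<hr>\n{source}"
    )
```

A server-side script would call `render_result(url, fetch_source(url))` and send the result to the browser. Since the fetched markup is re-emitted from your own server, visitors and spiders see it as your page, as noted above.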
I'm trying to scrape the following.
Most notably:
http://thewrestlingmall.com/htmls/news.asp?Cat=4
http://themat.com/frontNews/dynamic/top10.asp
http://themat.com/pressbox/presslist.asp?catid=4
But wouldn't mind:
http://thewrestlingmall.com/htmls/news.asp?Cat=2
http://thewrestlingmall.com/htmls/news.asp?Cat=3
http://thewrestlingmall.com/htmls/news.asp?Cat=5
http://themat.com/pressbox/presslist.asp?catid=49