There is a school locator script at https://www.ocps.net/parents/pages/FindaSchool.aspx. When you enter an address, it returns the schools zoned for that location.
For example, submit the form with these details:
Street Number: 3902
Street Name: Bobolink
Street Type: Lane
City: Orlando
It returns one row containing three schools: elementary, middle and high school. Clicking the "more" button next to a school opens that school's details page.
For each school I need the school name, which in this example would be:
Audubon Elementary
Glenridge Middle
Winter Park High
Please suggest some ways to achieve this. Thanks.
Use cURL to submit the form and retrieve the page (look at CURLOPT_POST and CURLOPT_POSTFIELDS). Once you have the page, use a regular expression to scrape out the school names.
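As a rough illustration of that suggestion, here is a minimal sketch. The helper names `postForm` and `extractSchoolNames` are made up for this example, and the `<td class="school">` pattern is purely an assumption about the result markup, so it would have to be adjusted after inspecting the real page.

```php
<?php
// Sketch only: helper names are hypothetical, and the markup matched by the
// regular expression is an ASSUMPTION, not the real ocps.net markup.

/** POST a form with cURL and return the response body, or null on failure. */
function postForm(string $url, array $fields): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        // http_build_query() URL-encodes every key and value.
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
    ]);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

/**
 * Pull school names out of the returned HTML.
 * ASSUMPTION: each name sits in <td class="school">...</td>.
 */
function extractSchoolNames(string $html): array
{
    preg_match_all('~<td class="school">\s*(.*?)\s*</td>~is', $html, $matches);
    return array_map('html_entity_decode', $matches[1]);
}

// Offline demonstration on made-up markup (no network access needed).
$sample = '<tr><td class="school">Glenridge Middle</td>'
        . '<td class="school"> Winter Park High </td></tr>';
print_r(extractSchoolNames($sample));
```

In real use, the string returned by `postForm()` would be passed to `extractSchoolNames()` in place of `$sample`.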
I used cURL to submit the form and scrape the result in step 1. But in the second step, which is clicking the "more" button to see the school names, cURL gives an error. Here is my code:
$url="https://www.ocps.net/parents/pages/findaschool.aspx";
//$logFileName = "ocfl_import_school_".date('Y-m-d').".log";
//$logFileHandle = fopen($logFileName, 'a');
echo $_SERVER['HTTP_USER_AGENT'];
$fields = array(
    '__SPSCEditMenu' => 'true',
    'MSOWebPartPage_PostbackSource' => '',
    'MSOTlPn_SelectedWpId' => '',
    'MSOTlPn_View' => '0',
    'MSOTlPn_ShowSettings' => 'False',
    'MSOGallery_SelectedLibrary' => '',
    'MSOGallery_FilterString' => '',
    'MSOTlPn_Button' => 'none',
    '__EVENTTARGET' => 'ctl00$m$g_5e6ff926_878b_4831_ae5a_37603a021d6e$gridview1$ctl02$CR_ELEM',
    '__EVENTARGUMENT' => '',
    '__REQUESTDIGEST' =>'0x0494115FE2800AB046FFA276752A8BCBACE244D0FC6E1AE75FC9567ABF3FAE0EB9462B6141C2EC6EFCE01B97DCE32A21B57BEAC400DCE9C2D15CE3FA746D8422,27 Jul 2010 12:23:10 -0000',
    'MSOAuthoringConsole_FormContext' => '',
    'MSOAC_EditDuringWorkflow' => '',
    'MSOSPWebPartManager_DisplayModeName' => 'Browse',
    'MSOWebPartPage_Shared' => '',
    'MSOLayout_LayoutChanges' => '',
    'MSOLayout_InDesignMode' => '',
    'MSOSPWebPartManager_OldDisplayModeName' => 'Browse',
    'MSOSPWebPartManager_StartWebPartEditingName' => 'false',
    'ctl00$m$g_5e6ff926_878b_4831_ae5a_37603a021d6e$gridview1$ctl02$CR_ELEM' => 'more',
    '__EVENTVALIDATION' =>'/wEWCQLYg4XOCQLw9Z3WCwLO/Y/kBQLZ1b79AwKc2spsAoWBoqcFAqq6wIsCAoOEl/0OAv318ZIMas32fAcmrAMZA+eOERUqECA+YaQ',
    'WPQ3streetNum' =>'',
    'WPQ3streetName' =>'',
    'WPQ3st_type' => 'all',
    'WPQ3City' => 'orlando',
    '__VIEWSTATE' =>
'/wEPDwUBMA9kFgJmD2QWAgIBD2QWBAIBD2QWAgIHD2QWAmYPZBYCAgEPFgIeE1ByZXZpb3VzQ29udHJvbE1vZGULKYgBTWljcm9zb2Z0LlNoYXJlUG9pbnQuV2ViQ29udHJvbHMuU1BDb250cm9sTW9kZSwgTWljcm9zb2Z0LlNoYXJlUG9pbnQsIFZlcnNpb249MTIuMC4wLjAsIEN1bHR1cmU9bmV1dHJhbCwgUHVibGljS2V5VG9rZW49NzFlOWJjZTExMWU5NDI5YwFkAgMPZBYIAgIPZBYEBSZnXzdhMWJjMGVlX2Q4Y2VfNDUyMF85YTU0X2ViZGU3MjI2NjM1Mg8PFhAeBVRpdGxlBRdFbnRlciBZb3VyIEFkZHJlc3MgSGVyZR4LRGVzY3JpcHRpb24FN1VzZSB0byBjb25uZWN0IHNpbXBsZSBmb3JtIGNvbnRyb2xzIHRvIG90aGVyIFdlYiBQYXJ0cy4eCkNocm9tZVR5cGUCAh4HQ2xpY2tlZGceCURpcmVjdGlvbgsqKlN5c3RlbS5XZWIuVUkuV2ViQ29udHJvbHMuQ29udGVudERpcmVjdGlvbgAeBVdpZHRoHB4GSGVpZ2h0HB4EXyFTQgKAgwhkZAUmZ181ZTZmZjkyNl84NzhiXzQ4MzFfYWU1YV8zNzYwM2EwMjFkNmUPZBYEZg8PZA8PFCsABRYGHgROYW1lBQZzdHJlZXQeDERlZmF1bHRWYWx1ZQUKJUJPQk9MSU5LJR4OUGFyYW1ldGVyVmFsdWVkFgYfCQUIdHlwZWFkZHIfCgUCTE4fC2QWBh8JBQRjaXR5HwoFB09STEFORE8fC2QWBh8JBQtmcm9tYWRkcmVzcx8KBQQzOTAyHwtkFgYfCQUJdG9hZGRyZXNzHwoFBDM5MDIfC2QUKwEFAgMCAwIDAgMCA2RkAgIPPCsADQEADxYEHgtfIURhdGFCb3VuZGceC18hSXRlbUNvdW50AgFkFgJmD2QWBgIBD2QWFGYPDxYCHgRUZXh0BQUwMzYwMGRkAgEPDxYCHw4FBTA0MDk5ZGQCAg8PFgIfDgUBIGRkAgMPDxYCHw4FBiZuYnNwO2RkAgQPDxYCHw4FCEJPQk9MSU5LZGQCBQ8PFgIfDgUCTE5kZAIGDw8WAh8OBQdPUkxBTkRPZGQCBw9kFgJmDw8WAh4PQ29tbWFuZEFyZ3VtZW50BUc/c2Nob29sbnVtYmVyPTA1MzEmTnVtYmVyPTM5MDImU3RyZWV0PUJPQk9MSU5LJlR5cGVBZGRyPUxOJkNpdHk9T1JMQU5ET2RkAggPZBYCZg8PFgIfDwVHP3NjaG9vbG51bWJlcj0wNTcxJk51bWJlcj0zOTAyJlN0cmVldD1CT0JPTElOSyZUeXBlQWRkcj1MTiZDaXR5PU9STEFORE9kZAIJD2QWAmYPDxYCHw8FRz9zY2hvb2xudW1iZXI9MTQxMSZOdW1iZXI9MzkwMiZTdHJlZXQ9Qk9CT0xJTksmVHlwZUFkZHI9TE4mQ2l0eT1PUkxBTkRPZGQCAg8PFgIeB1Zpc2libGVoZGQCAw8PFgIfEGhkZAIID2QWAgIBD2QWAmYPD2QWAh4FY2xhc3MFGG1zLXNidGFibGUgbXMtc2J0YWJsZS1leGQCCg9kFgICAQ9kFgQCAQ9kFgICAQ8WAh8QaBYCZg9kFgQCAg9kFgYCAQ8WAh8QaGQCAw8WAh8QaGQCBQ8WAh8QaGQCAw8PFgIeCUFjY2Vzc0tleQUBL2RkAgMPZBYCAgEPDxYCHxBoZBYEAgEPDxYCHxBoZGQCAw8PFgIfEGhkFgICAQ8PFgIfEGdkFgQCAQ8PFgIfEGhkFhwCAQ8PFgIfEGhkZAIDDxYCHxBoZAIFDw8WAh8QaGRkAgcPFgIfEGhkAgkPDxYCHxBoZGQCCw8PFgIfEGhkZAINDw8WAh8QaGRkAg8PDxYEHgdFbmFibGVkaB8QaGRkAhEPDxYCHxBoZGQ
CEw8PFgQfE2gfEGhkZAIVDw8WAh8QaGRkAhcPFgIfEGhkAhkPFgIfEGhkAhsPDxYCHxBnZGQCAw8PFgIfEGdkFgYCAQ8PFgIfEGdkZAIDDw8WAh8QZ2RkAgUPDxYCHxBnZGQCMg9kFgICAQ9kFgJmDw8WAh8QaGRkGAIFOGN0bDAwJG0kZ181ZTZmZjkyNl84NzhiXzQ4MzFfYWU1YV8zNzYwM2EwMjFkNmUkZ3JpZHZpZXcxDzwrAAoBCAIBZAUVY3RsMDAkUXVpY2tMYXVuY2hNZW51Dw9kBQdTY2hvb2xzZOGvfu3GsFV2Wbtmxm+7ozp0VnMG'
);
$fields_string = "";
$count = 0;
foreach($fields as $key=>$value) {
    if ($count > 0 ) $fields_string .= '&';
    $fields_string .= $key.'='.$value;
    $count++;
}
echo $fields_string;
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_POST,count($fields));
curl_setopt($ch,CURLOPT_POSTFIELDS,$fields_string);
curl_setopt($ch, CURLOPT_USERAGENT,"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt ($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, 0);
$result = curl_exec($ch);
curl_close($ch);
echo $result;
Please take a look and suggest what the issue is. Thanks.
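It looks like the form relies on per-request state rather than fixed values. You will need to fetch the page first as raw HTML, then parse out the __VIEWSTATE value, and possibly some other hidden fields (such as __EVENTVALIDATION and __REQUESTDIGEST), because they change from one response to the next and cannot be hard-coded.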
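A minimal sketch of that approach follows: it reads the hidden fields from freshly fetched HTML instead of hard-coding them. The helper names `extractHiddenFields`, `buildPostbackBody` and `fetchPage` are invented for this example, the sample page is made up, and it is an assumption that the site also needs its cookies carried between requests. Values are URL-encoded with `http_build_query()`, since characters such as `+`, `/` and `=` in these tokens would otherwise be altered in transit.

```php
<?php
// Sketch only: helper names are hypothetical, not part of any library.

/** Collect every <input type="hidden"> name/value pair from a page. */
function extractHiddenFields(string $html): array
{
    if ($html === '') {
        return [];
    }
    $previous = libxml_use_internal_errors(true); // tolerate sloppy HTML
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '' && strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

/** Build a URL-encoded body for a postback triggered by $eventTarget. */
function buildPostbackBody(string $html, string $eventTarget, array $extra = []): string
{
    $fields = array_merge(extractHiddenFields($html), $extra);
    $fields['__EVENTTARGET'] = $eventTarget;
    $fields['__EVENTARGUMENT'] = '';
    return http_build_query($fields);
}

/** GET (or POST when $postBody is given), sharing one cookie file. */
function fetchPage(string $url, string $cookieFile, ?string $postBody = null): ?string
{
    $ch = curl_init($url);
    $options = [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR      => $cookieFile, // ASSUMPTION: cookies matter here
        CURLOPT_COOKIEFILE     => $cookieFile,
    ];
    if ($postBody !== null) {
        $options[CURLOPT_POST] = true;
        $options[CURLOPT_POSTFIELDS] = $postBody;
    }
    curl_setopt_array($ch, $options);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Offline demonstration on a made-up page (no network access needed).
$page = '<form><input type="hidden" name="__VIEWSTATE" value="/wEP+abc=" />'
      . '<input type="hidden" name="__EVENTVALIDATION" value="/wEW/x+y" />'
      . '<input type="text" name="WPQ3City" value="orlando" /></form>';
$target = 'ctl00$m$g_5e6ff926_878b_4831_ae5a_37603a021d6e$gridview1$ctl02$CR_ELEM';
$body = buildPostbackBody($page, $target);
echo $body, "\n";
```

Against the live site, the flow would be `fetchPage()` to get the results page, `buildPostbackBody()` on that HTML, then `fetchPage()` again with the body. Visible inputs such as `WPQ3City` are not hidden fields, so they would be passed through `$extra`.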
Yes, I downloaded the HTML source and then created the cURL field values from it. When I post the form from the HTML page it submits, but the cURL request does not submit successfully. I have another question: do you have any idea how to use a regular expression for the cURL URL parameters?