Hi, I would appreciate any help anyone could give me on this subject. I need to create a script that does the following:
1. Upload files (allowed formats: PDF, Word Doc, RTF, JPG, GIF, BMP)
2. Validate that the format of the uploaded file is one of the allowed formats
3. Search inside PDF, Word Doc and RTF files
I'm OK with uploading the files, but I don't know how to validate their formats or how to search inside a PDF, DOC or RTF file. Cheers, R
You'll need to store the allowed MIME types and compare the uploaded file's MIME type against them.

$allowedMimes = array("application/msword", "application/pdf", "application/rtf");
if (!in_array($fileMime, $allowedMimes)) {
    // fail
} else {
    // do something
}

A list of MIME types: http://www.webmaster-toolkit.com/mime-types.shtml

For searching, I think you'll need a PDF library and a DOC library, but I can't say for sure.
To search inside PDF files on Unix/Linux/Mac systems you can use pdftotext, which extracts the text. The program catdoc does the same for most Word files. Both may need to be installed, as they are not normally part of the standard software set. If you are on shared hosting, they may already be there:

% hostname -f
yoohoo.dreamhost.com
% which pdftotext
/usr/bin/pdftotext
% which catdoc
catdoc not found
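Calling those tools from PHP could look roughly like the sketch below. The function names buildExtractCommand and documentContains are made up for illustration, and the sketch assumes pdftotext and catdoc are on the server's PATH and that shell_exec is not disabled by the host. RTF is left out because neither tool is described above as handling it.

```php
<?php
// Hypothetical helper: build the shell command that prints a document's text.
// pdftotext writes to stdout when the output file is "-"; catdoc prints to stdout by default.
function buildExtractCommand(string $path, string $type): string
{
    // escapeshellarg() keeps spaces or quotes in the path from breaking the command.
    $arg = escapeshellarg($path);
    switch ($type) {
        case 'pdf':
            return 'pdftotext ' . $arg . ' -';
        case 'doc':
            return 'catdoc ' . $arg;
        default:
            throw new InvalidArgumentException("No extractor for type: $type");
    }
}

// Hypothetical helper: case-insensitive search in the extracted text.
function documentContains(string $path, string $type, string $needle): bool
{
    $text = shell_exec(buildExtractCommand($path, $type));
    // shell_exec() does not return a string when the command fails or prints nothing.
    if (!is_string($text)) {
        return false;
    }
    return stripos($text, $needle) !== false;
}
```

For example, documentContains($uploaddir.'/report.pdf', 'pdf', 'invoice') would report whether the word appears anywhere in that PDF. For a large collection it is probably better to extract the text once at upload time and store it, rather than running the tool on every search.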
// Returns 0 if the extension is allowed, 1 otherwise.
function checkExtension($extension)
{
    $allowed = array("doc", "pdf", "rtf", "jpeg", "jpg", "bmp", "png", "gif");
    return in_array(strtolower($extension), $allowed, true) ? 0 : 1;
}

// Returns the text after the last dot in the file name.
function getExtension($fileName)
{
    $ext_arr = explode('.', $fileName);
    $cnt = count($ext_arr);
    return $ext_arr[$cnt - 1];
}

if (is_uploaded_file($_FILES['fileField']['tmp_name'])) {
    $fileName = $_FILES['fileField']['name'];
    $extension = getExtension($fileName);
    $checkExtension = checkExtension($extension);

    // Validate before storing, so a rejected upload never reaches $uploaddir.
    if ($checkExtension != 0) {
        echo("<script>alert('Invalid file format')</script>");
        echo("<script>location.href='redirectlocation.php'</script>"); // redirects to the place you want
        exit;
    }

    // $uploaddir and $random are assumed to be defined earlier in the script.
    // basename() strips any directory parts from the client-supplied name.
    $orgDir = $uploaddir.'/';
    move_uploaded_file($_FILES['fileField']['tmp_name'], $orgDir.$random.basename($fileName));
    $files[] = $fileName;
}
mime_content_type was marked deprecated in the manual, which points to the Fileinfo extension instead. One of the two should work for you: http://us3.php.net/manual/en/function.mime-content-type.php That will tell you the type from the file contents, which might be a touch better than just going off the extension. Here's something about PHP and RTF: http://www.tuxradar.com/practicalphp/11/3/0
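As a rough sketch of the Fileinfo route, something like the following might work. The helper names detectMime and isAllowedUpload are invented for this example, the image types are added to the list from earlier in the thread, and the exact MIME strings reported can differ between libmagic versions (BMP and Word files in particular), so treat the list as a starting point to check on your own server.

```php
<?php
// Allowed types; the exact strings are assumptions and may need adjusting per server.
$allowedMimes = array(
    'application/pdf',
    'application/msword',
    'application/rtf',
    'text/rtf',
    'image/jpeg',
    'image/gif',
    'image/bmp',
    'image/x-ms-bmp',
);

// Hypothetical helper: detect the MIME type from the file contents, not the name.
function detectMime(string $path): string
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime = $finfo->file($path);
    return $mime === false ? '' : $mime;
}

// Hypothetical helper: true when the detected type is in the allowed list.
function isAllowedUpload(string $path, array $allowedMimes): bool
{
    return in_array(detectMime($path), $allowedMimes, true);
}
```

In an upload handler this would be called on $_FILES['fileField']['tmp_name'] before the file is moved, for example: if (!isAllowedUpload($_FILES['fileField']['tmp_name'], $allowedMimes)) { /* reject */ }.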