If it's your own server, find php.ini (or create one in your web root if it doesn't exist) and insert the following:

disable_functions = file_get_contents

If it's a remote server/site that is scraping your content, you can try checking the HTTP_REFERER via PHP or .htaccess, and echo a message or redirect if the HTTP_REFERER matches the site that is scraping your content. However, this method is unreliable, as cURL can easily manipulate the HTTP_REFERER. Take a look at the following: http://www.javascriptkit.com/howto/htaccess14.shtml
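A minimal PHP sketch of that referer check; scraper-site.example and www.example.com are placeholders for the scraping site and your own site:

<?php
// Referer-based check: unreliable, since clients can omit or spoof the header.
$referer = isset( $_SERVER['HTTP_REFERER'] ) ? $_SERVER['HTTP_REFERER'] : '';

if ( stripos( $referer, 'scraper-site.example' ) !== false ) {
    // Either echo a message:
    // exit( 'This content lives at www.example.com' );
    // ...or redirect back to your own site:
    header( 'Location: http://www.example.com/' );
    exit;
}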
If you know the IP address of their server (find it by pinging the site or by logging access attempts), you can block access from that address via $_SERVER['REMOTE_ADDR']. But $_SERVER['REMOTE_ADDR'] is not always reliable; depending on proxies, the client's true IP address may turn up in other locations:

- $_SERVER['HTTP_CLIENT_IP']
- $_SERVER['HTTP_X_FORWARDED_FOR']
- $_SERVER['HTTP_PC_REMOTE_ADDR']

function get_ip_address() {
    static $ip_address;

    if ( NULL === $ip_address ) {
        // Where are we looking?
        $locations = array(
            'HTTP_CLIENT_IP',
            'REMOTE_ADDR',
            'HTTP_X_FORWARDED_FOR',
            'HTTP_PC_REMOTE_ADDR',
        );

        // Search for the IP address
        foreach ( $locations as $location ) {
            if ( isset( $_SERVER[ $location ] ) ) {
                $ip_address = $_SERVER[ $location ];
                break;
            }
        }

        // X-Forwarded-For may contain multiple comma-separated addresses;
        // the (string) casts also guard against no location being set at all
        if ( FALSE !== strpos( (string) $ip_address, ',' ) ) {
            $ip_address = explode( ',', $ip_address );
            $ip_address = end( $ip_address );
        }

        $ip_address = trim( (string) $ip_address );

        // Match the result against an IPv4 pattern
        if ( ! preg_match( '/^\d{1,3}(\.\d{1,3}){3}$/', $ip_address ) ) {
            $ip_address = '0.0.0.0';
        }
    }

    return $ip_address;
}

Usage:

$banned_sites = array(
    '123.45.67.89', // the scraper's address
    '0.0.0.0',      // anything that failed the IPv4 check above
);

if ( in_array( get_ip_address(), $banned_sites ) ) {
    exit( 'Inappropriate access detected' );
}
As soon as they detect that they're blocked, they'll start using a proxy, assuming they're not routing requests through random proxies already. Instead, just try to detect them, then slip a couple of in-content links back to yourself into the pages you return to the scrapers. Blocking scrapers always turns out to be a losing battle in the long run; if you can find ways to take advantage of them, you'll do much better.
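A minimal sketch of that approach, reusing get_ip_address() from the post above; the is_scraper() helper, the banned address, and the backlink URL are all hypothetical:

<?php
// Hypothetical detector: here just the banned-IP list from the earlier post,
// but it could equally check user agents or request rates.
function is_scraper() {
    $banned_sites = array( '123.45.67.89' );
    return in_array( get_ip_address(), $banned_sites );
}

$page = file_get_contents( 'article.html' ); // however you build your page

if ( is_scraper() ) {
    // Slip an in-content link back to yourself into what the scraper receives
    $page .= '<p>Source: <a href="http://www.example.com/">www.example.com</a></p>';
}

echo $page;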
Have a look at the HTML obfuscator from IonCube, located here. I have no idea how they do it, but the page does render with no discernible source; see their example.
Attempting to obfuscate HTML is:

a) futile
b) a great way to get horrible placement in search engines
Well, as stated before, trying to block scrapers is always a losing battle, especially since you run the risk of blocking search engines. Instead, you can find advantages in that small window of opportunity, i.e. placing links, copyright notices, and keywords back to your site in the text, or even cutting the article short and placing a "Rest of article..." link at the end of the piece.
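For the cut-the-article variant, a minimal sketch along the same lines, again assuming the hypothetical is_scraper() detector from above and example.com as your site:

<?php
$full_article = file_get_contents( 'article.html' ); // however you load it

if ( is_scraper() ) {
    // Serve only a teaser, with a link back to the full piece on your site
    $teaser = substr( strip_tags( $full_article ), 0, 500 );
    echo $teaser . '... <a href="http://www.example.com/article">Rest of article...</a>';
} else {
    echo $full_article;
}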