Someone auto-scraping my content. Any solution?

Istvan Well-Known Member

Messages: 1,544

Likes Received: 43

Best Answers: 0

Trophy Points: 175

#1

Hi DPers,

Today I found that a site is scraping my content and reusing it on its own website. From what I have seen, he uses file_get_contents to fetch the pages and JavaScript to replace my URLs with his internal URLs.

Is there a way, through .htaccess or robots.txt, to stop that domain from fetching my content?

Thanks

Istvan, Nov 12, 2009 IP

tolra Active Member

Messages: 515

Likes Received: 36

Best Answers: 1

Trophy Points: 80

#2

I assume he's always using the same IP to scrape from, therefore in your .htaccess file:

<Limit GET PUT POST>
order allow,deny
deny from 1.2.3.4
allow from all
</Limit>

Change 1.2.3.4 to his IP address; that should block him from your site, at least until he changes the IP he scrapes from.
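
One caveat: on Apache 2.4 and later, order/deny/allow are deprecated and only work through mod_access_compat. A rough equivalent for newer servers might look like the sketch below, assuming mod_authz_core is loaded and your host allows these directives in .htaccess; 1.2.3.4 is still just a placeholder for the scraper's IP.

```apache
# Apache 2.4+ sketch: let everyone in except the one address.
# 1.2.3.4 is a placeholder, replace it with the scraper's IP.
<RequireAll>
    Require all granted
    Require not ip 1.2.3.4
</RequireAll>
```

Use one style or the other in a given .htaccess file; mixing the old and new directives can produce confusing results.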

tolra, Nov 12, 2009 IP

Bohra Prominent Member

Messages: 12,573

Likes Received: 537

Best Answers: 0

Trophy Points: 310

#3

Just find out the IP of the host he is using and block it; that way his server can't connect to yours.

Bohra, Nov 13, 2009 IP

Istvan likes this.

Istvan Well-Known Member

Messages: 1,544

Likes Received: 43

Best Answers: 0

Trophy Points: 175

#4

Thanks, I have blocked the server IP in .htaccess and it seems OK now.

Istvan, Nov 14, 2009 IP

ravee1981 Active Member

Messages: 712

Likes Received: 8

Best Answers: 0

Trophy Points: 60

#5

tolra said: ↑

I assume he's always using the same IP to scrape from, therefore in your .htaccess file:

<Limit GET PUT POST>
order allow,deny
deny from 1.2.3.4
allow from all
</Limit>

Change 1.2.3.4 to his IP address; that should block him from your site, at least until he changes the IP he scrapes from.

I have that code on all my sites, with the IP ranges of all the popular datacenters. These scrapes come only from servers, not from home PCs.
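
To block a whole range rather than a single address, the same deny from lines accept CIDR notation. A minimal sketch, where both ranges are reserved documentation placeholders and not real datacenter ranges:

```apache
<Limit GET PUT POST>
order allow,deny
# Placeholder ranges (reserved for documentation).
# Swap in the ranges you actually want to block.
deny from 192.0.2.0/24
deny from 198.51.100.0/24
allow from all
</Limit>
```

Keep in mind that a range block also shuts out any legitimate visitor or service in that range.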

You can also write to the abuse email address for the domain, usually found in the whois record, and report the scraping. Either the content will be removed or the domain will get banned.

ravee1981, Nov 16, 2009 IP
