Is web scraping Legal?

Discussion in 'Legal Issues' started by idmindia, Aug 16, 2011.

  1. #1
    Is web scraping, web crawling, or data extraction from websites, online stores, and travel portals legal on the Internet?

    Thanks
     
    idmindia, Aug 16, 2011 IP
  2. Chuckun

    Chuckun Well-Known Member

    Messages:
    1,161
    Likes Received:
    60
    Best Answers:
    2
    Trophy Points:
    150
    #2
    I am not a lawyer, but I think I have a general understanding.

    If the content is free for public use and not copyrighted, then by all means scrape it. But if you scrape anything that is the sole property of an individual or company, you could face prosecution.

    Common sense really.
     
    Chuckun, Aug 16, 2011 IP
  3. contentboss

    contentboss Peon

    Messages:
    3,241
    Likes Received:
    54
    Best Answers:
    0
    Trophy Points:
    0
    #3
    It's the use, not the scraping, that is the problem. Doesn't seem to bother Google, though. Biggest scraper in the world.
     
    contentboss, Aug 17, 2011 IP
  4. attorney jaffe

    attorney jaffe Member

    Messages:
    241
    Likes Received:
    12
    Best Answers:
    0
    Trophy Points:
    45
    #4
    A copyright is the exclusive property right granted to the author or creator of original works including literary, dramatic, musical, artistic, and certain other intellectual works.

    Despite what some people think -- if it is posted on the Internet it already belongs to somebody.

    The way in which copyright protection is secured is frequently misunderstood. No publication, registration, or other action in the Copyright Office is required to secure copyright. Copyright is secured automatically upon creation!

    Therefore, scraping and re-posting something on the Internet violates the copyrights of the original creator.
     
    attorney jaffe, Aug 17, 2011 IP
  5. browntwn

    browntwn Illustrious Member

    Messages:
    8,347
    Likes Received:
    848
    Best Answers:
    7
    Trophy Points:
    435
    #5
    All of that being said, I think scraping is perfectly legal.
     
    browntwn, Aug 17, 2011 IP
  6. contentboss

    contentboss Peon

    Messages:
    3,241
    Likes Received:
    54
    Best Answers:
    0
    Trophy Points:
    0
    #6
    Every time you look at a page, your browser 'scrapes' it. 
     
    contentboss, Aug 18, 2011 IP
  7. pigpromoter

    pigpromoter Well-Known Member

    Messages:
    542
    Likes Received:
    13
    Best Answers:
    1
    Trophy Points:
    165
    #7
    This holds for websites operated by US residents; laws in other countries may differ!

    ... unless the content is published under a "permissive" license
     
    pigpromoter, Aug 18, 2011 IP
  8. Cucumba123

    Cucumba123 Well-Known Member

    Messages:
    1,403
    Likes Received:
    34
    Best Answers:
    3
    Trophy Points:
    150
    #8
    It's legal to scrape any webpage for personal use, as in offline browsing. If you re-publish copyrighted material online, then you'll be in trouble; sooner or later the original publisher will find out about it.
    And then what's the use of this? Your website will get de-indexed because of duplicate content.

    I personally sometimes scrape valuable information so I can read it later in case the website goes down.
     
    Cucumba123, Aug 18, 2011 IP
  9. Matthias

    Matthias Member

    Messages:
    88
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    48
    #9
    This really is, in many ways, a gray area, because of the proliferation of RSS. Syndicating content through RSS feeds is dramatically changing the landscape of content publication and re-publication. Most bloggers I know are quite torn over the issue of a full RSS feed vs a partial one just because of web scraping.

    For those who put out a full RSS feed, you are literally handing over your content in a nice, neat little bundle just waiting to be lifted. Partial feeds, while protective of the content, tend to put off people who rely heavily on feed readers to aggregate large volumes of content.

    I personally believe using the title and a short description with a proper link back to the original site is fine. The important factor is that there must be a clear disclosure, so that the content is not used in a way that implies ownership or copyright. I use this approach with my NewsHound (see sig for link) ticker, where I use the title. I believe as long as you make every reasonable effort NOT to infringe on the copyright holder, you shouldn't have any problems. There is also the "fair use" doctrine in US copyright law (17 U.S.C. § 107, rather than the DMCA itself), under which quoting a limited portion of the original content with proper attribution may be allowed.

    Here is an online newspaper that uses this technique:

    http://www.israelherald.com/

    Notice that they always provide the link to the original source and don't claim ownership of the external content.
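
    A minimal sketch of the title-plus-short-description approach, using only Python's standard library. The sample feed, the `summarize_feed` name, and the `max_chars` cutoff are all made up for illustration; a real aggregator would fetch the publisher's actual feed and pick its own excerpt length.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real one would come from the publisher's RSS URL.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first-post</link>
      <description>This is a long description of the first post that goes on and on.</description>
    </item>
  </channel>
</rss>"""


def summarize_feed(rss_text, max_chars=60):
    """Return the title, a short excerpt, and the source link for each item."""
    root = ET.fromstring(rss_text)
    entries = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        description = (item.findtext("description") or "").strip()
        # Keep only a short excerpt instead of the full text.
        excerpt = description[:max_chars]
        if len(description) > max_chars:
            excerpt = excerpt.rstrip() + "..."
        # Always carry the link back to the original source.
        entries.append({"title": title, "excerpt": excerpt, "source": link})
    return entries


if __name__ == "__main__":
    for entry in summarize_feed(SAMPLE_RSS, max_chars=20):
        print(entry["title"])
        print(entry["excerpt"])
        print("Source:", entry["source"])
```

    The point is only that each entry keeps a truncated description and the link back, never the full article text.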
     
    Matthias, Aug 18, 2011 IP
  10. browntwn

    browntwn Illustrious Member

    Messages:
    8,347
    Likes Received:
    848
    Best Answers:
    7
    Trophy Points:
    435
    #10
    The question is about scraping, not re-publication.
     
    browntwn, Aug 19, 2011 IP
  11. Blue Star Ent.

    Blue Star Ent. Well-Known Member

    Messages:
    1,989
    Likes Received:
    31
    Best Answers:
    0
    Trophy Points:
    160
    #11

    That is what I was thinking... Yahoo, Google, MSN and plenty more are showing excerpts of people's sites on their own sites. In a day when large companies (Monsanto) are patenting seeds, of all things, what is to be expected of your copyright notice? Do they need our permission to show our data on their computers?
     
    Blue Star Ent., Aug 19, 2011 IP
  12. Blue Star Ent.

    Blue Star Ent. Well-Known Member

    Messages:
    1,989
    Likes Received:
    31
    Best Answers:
    0
    Trophy Points:
    160
    #12
    The first is done by a machine controlled by a human. The second is done with a machine controlled by a human.


    Boy, this gets complicated... :eek: what is the basic difference between a robot arriving to a site and ..... a person accessing the same site and copying and pasting what they see. Both are under control of a human. I feel like I am opening a can of worms here.
     
    Blue Star Ent., Aug 19, 2011 IP
  13. Matthias

    Matthias Member

    Messages:
    88
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    48
    #13
    In terms of a splog or autoblog, they can be one and the same. What does the scraper do with the content they just scraped? Re-publish it. Search engines only publish meta-data, which, as defined by the W3C, is not scraping.
     
    Matthias, Aug 19, 2011 IP
  14. Matthias

    Matthias Member

    Messages:
    88
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    48
    #14
    It's a mess. The issue revolves entirely around what is going to be done with the scraped content, what percentage of the content is going to be used, and whether proper citation of the content will be given. The answers to these questions really define the pivotal legal issue of infringement while considering intent with or without malice.
     
    Matthias, Aug 19, 2011 IP
  15. browntwn

    browntwn Illustrious Member

    Messages:
    8,347
    Likes Received:
    848
    Best Answers:
    7
    Trophy Points:
    435
    #15
    They are not the same. I can think of many uses for scraped data that do not involve re-publication.
     
    browntwn, Aug 19, 2011 IP
  16. Blue Star Ent.

    Blue Star Ent. Well-Known Member

    Messages:
    1,989
    Likes Received:
    31
    Best Answers:
    0
    Trophy Points:
    160
    #16

    Proving malice means having to go before a court, which makes lawyers lots of money... I am not one, but it looks like that is a self-serving circle.
     
    Blue Star Ent., Aug 19, 2011 IP
  17. Matthias

    Matthias Member

    Messages:
    88
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    48
    #17
    For the lawyers, indeed. Adding to this mess, the DMCA automatically favors the content rights holder in most cases.

    Here are a few links that only add to the murkiness:

    http://www.plagiarismtoday.com/2006/08/29/why-rss-scraping-isnt-ok/

    http://www.plagiarismtoday.com/2007/09/20/the-dmca-on-7-search-engines/

    http://newmedialaw.proskauer.com/tags/dmca/
     
    Last edited: Aug 19, 2011
    Matthias, Aug 19, 2011 IP
  18. ab420

    ab420 Well-Known Member

    Messages:
    417
    Likes Received:
    6
    Best Answers:
    0
    Trophy Points:
    110
    #18
    Scraping public content is fine... however certain ways of using that content can certainly violate a wide array of laws.
     
    ab420, Aug 19, 2011 IP
  19. Matthias

    Matthias Member

    Messages:
    88
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    48
    #19
    You just brought up one of the biggest problems with the DMCA: what constitutes public content. Fair use advocates have long contended that if a copyright notice is not explicitly displayed, the content is public, because the nature of the Internet is public unless the content sits behind some form of access protection such as a login screen or other authorization method.

    Cases have been won in favor of the DMCA position, supporting an implied copyright even though the content is freely accessible through open public means and does not display an explicit copyright notice.

    Here are a couple links that go into this issue in more detail:

    http://www.templetons.com/brad/copymyths.html

    http://www.amerindianarts.us/articles/digital_millennium_copyright_act.shtml
     
    Matthias, Aug 19, 2011 IP
  20. ab420

    ab420 Well-Known Member

    Messages:
    417
    Likes Received:
    6
    Best Answers:
    0
    Trophy Points:
    110
    #20
    Technically, scraping is just getting the information; it has nothing to do with what you do with that information. So copyright laws do not come into play at all UNTIL you attempt to reuse or republish that information. If they are displaying it publicly on their website, you can use a scraping program to download it all to your computer for your own viewing pleasure, and there is no violation of any laws at all.
     
    ab420, Aug 19, 2011 IP