Plagiarism checking software

Discussion in 'Programming' started by axiaer, Apr 15, 2010.

  1. #1
    Hey all, I am looking to develop plagiarism-checking software. Can anyone give me a basic idea of how it could work? Besides that, can you recommend any online site for checking plagiarism?
     
    axiaer, Apr 15, 2010 IP
  2. bratosab

    bratosab Active Member

    #2
    Hello,
    I don't know how it works internally, but one example is copyscape(dot)com

    Best Regards.
     
    bratosab, Apr 16, 2010 IP
  3. ccoonen

    ccoonen Well-Known Member

    #3
    First, I would scrape the original site's content (strip the tags), then take random snippets from the content, around 8-12 words each, maybe 3 or 4 such blocks. Then do an exact-phrase search on Google/Yahoo/Bing for each one. If matches come back, the content was probably stolen.
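
A rough sketch of the first half of that idea (strip the tags, pick a few random 8-12 word snippets, wrap each in quotes for an exact-phrase search). The function names `strip_tags`, `random_snippets` and `exact_queries` are made up for illustration, and nothing here talks to a search engine; it only builds the query strings.

```python
import random
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_tags(html):
    """Return the visible text of an HTML page with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def random_snippets(text, count=4, min_words=8, max_words=12, rng=None):
    """Pick `count` random runs of consecutive words from `text`."""
    rng = rng or random.Random()
    words = text.split()
    snippets = []
    for _ in range(count):
        length = min(rng.randint(min_words, max_words), len(words))
        if length == 0:
            break  # nothing to sample from
        start = rng.randint(0, len(words) - length)
        snippets.append(" ".join(words[start:start + length]))
    return snippets


def exact_queries(snippets):
    """Wrap each snippet in double quotes, the usual exact-phrase syntax."""
    return ['"{}"'.format(s.replace('"', "")) for s in snippets]
```

Typical use would be `exact_queries(random_snippets(strip_tags(page_html)))`, which gives 3 or 4 quoted strings to paste or send to whichever search engine you use. Random snippets can overlap, which is fine for a spot check.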
     
    ccoonen, Apr 16, 2010 IP
  4. viron86

    viron86 Active Member

    #4
    Yes, you need to depend on search engines for the result. Search for the given string in all the major search engines; if any one of them returns a match, the article is stolen, otherwise it is unique.
    But this depends entirely on whether the search engine has indexed the article.
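
The check described above could look roughly like this. It is only a sketch: `engines` is a hypothetical dict mapping an engine name to a function that takes a query string and returns a hit count, because each real search engine has its own API and terms. The function `looks_copied` is an invented name too.

```python
def looks_copied(snippets, engines):
    """Return (True, engine_name, snippet) for the first exact-phrase hit.

    Returns (False, None, None) if no engine reports a match. That only
    means the text was not found in those indexes, not that it is
    guaranteed to be original.
    """
    for snippet in snippets:
        query = '"{}"'.format(snippet)  # exact-phrase query
        for name, search in engines.items():
            if search(query) > 0:
                return True, name, snippet
    return False, None, None


# Stand-in engines for trying the logic without any network access.
fake_index = {'"stolen sentence here"': 3}
demo_engines = {
    "engine_a": lambda q: 0,
    "engine_b": lambda q: fake_index.get(q, 0),
}
result = looks_copied(["fresh words", "stolen sentence here"], demo_engines)
```

The loop stops at the first hit, so an article that is flagged early costs fewer queries than one that turns out to be clean.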
     
    viron86, Apr 18, 2010 IP
  5. axiaer

    axiaer Peon

    #5
    Thanks a lot. Yes, Copyscape is the best free site for this, and there is plenty of other free software. But suppose we break a whole document of, let's say, 1000 words into 100 queries of 10 words each, and we do this for 20 articles. Can you imagine the number of queries sent to Google or another search engine? What is the chance of getting blocked, since Google is highly sensitive in this regard?
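
To put numbers on that, here is a small back-of-the-envelope calculation. The 5-second gap between queries is just an assumed value for illustration, not a known limit of any search engine, and `query_count` and `hours_needed` are made-up helper names.

```python
import math


def query_count(words_per_doc, words_per_query, docs, engines=1):
    """Queries needed if every document is split into fixed-size chunks."""
    per_doc = math.ceil(words_per_doc / words_per_query)
    return per_doc * docs * engines


def hours_needed(queries, seconds_between_queries):
    """Wall-clock hours if queries are sent one at a time with a fixed gap."""
    return queries * seconds_between_queries / 3600


full_scan = query_count(1000, 10, 20)   # 100 queries per article, 20 articles
spot_check = 4 * 20                     # 4 random snippets per article instead
print(full_scan, spot_check)
print(round(hours_needed(full_scan, 5), 2), "hours at one query per 5 s (assumed)")
```

Checking every 10-word chunk comes to 2000 queries for the 20 articles, while sampling 4 snippets per article comes to 80, which is the main reason to sample rather than scan the whole text.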
     
    axiaer, Apr 18, 2010 IP
  6. n3r0x

    n3r0x Well-Known Member

    #6
    Simple: just use different proxy servers and emulate different web browsers (a new browser and proxy for each query).
     
    n3r0x, Apr 26, 2010 IP
  7. Dark3n

    Dark3n Peon

    #7
    Looks pretty neat, I need to get this. :)
     
    Dark3n, Apr 26, 2010 IP