Content Scrapers

Discussion in 'Services' started by incdeveloper, Mar 18, 2011.

  1. #1
    If you need to extract content from any website, I may be able to help you.

    I have a Java-based spider running on top of several EC2 instances, with many threads looking for content on the web. Yesterday I processed 75,000 pages with 1 EC2 instance (1 GB of RAM, 2 cores), and I can add as many instances as I want because of the app's architecture.
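
    For illustration only, here is a minimal Java sketch of how a multi-threaded fetch stage could look. It is not the actual spider: the class name `SpiderPoolSketch`, the `crawl` method, and the pluggable `fetcher` function are all hypothetical, and a real crawler would also need politeness delays, retries, and robots.txt handling.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: fetch a batch of URLs on a fixed-size thread pool.
class SpiderPoolSketch {

    // 'fetcher' is an assumption: any function that maps a URL to its page content.
    static Map<String, String> crawl(List<String> urls, int threads,
                                     Function<String, String> fetcher) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per URL; the pool runs at most 'threads' at a time.
            List<Future<String>> futures = new ArrayList<>();
            for (String url : urls) {
                Callable<String> task = () -> fetcher.apply(url);
                futures.add(pool.submit(task));
            }
            // Collect results in the original URL order.
            Map<String, String> results = new LinkedHashMap<>();
            for (int i = 0; i < urls.size(); i++) {
                results.put(urls.get(i), futures.get(i).get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException("crawl interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("fetch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in fetcher that does no network I/O; swap in a real HTTP client.
        Map<String, String> pages =
                crawl(List.of("page-1", "page-2"), 2, url -> "content of " + url);
        System.out.println(pages);
    }
}
```

    Because each task is independent, the same batch could be split across several machines, which is one plausible reading of "as many instances as I want".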

    I am currently processing big data from big websites like Twitter, LinkedIn, and others...

    PM me if you have questions.
    ;)
     
    incdeveloper, Mar 18, 2011
  2. Reputation

    #2
    What kind of format will the data be in? And can it be customized to my needs? (E.g., I need an export of all the products on an e-cart website, with all the product picture names and prices, in Excel format.) Can you handle such custom demands? More details would be nice.
     
    Reputation, Mar 18, 2011
  3. incdeveloper

    #3
    The output format doesn't matter, because the data can be converted to whatever you want...
    - XML
    - JSON
    - MySQL
    - CSV
    - Excel
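
    As a rough sketch of one such conversion, the Java below writes scraped rows out as CSV, which Excel can open directly. The class `CsvExportSketch`, its methods, and the column names are hypothetical and chosen to match the product export asked about above; quoting follows the usual CSV convention of doubling embedded quotes.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: turn scraped rows into CSV text.
class CsvExportSketch {

    // Quote a field when it contains a comma, a double quote, or a line break.
    static String escape(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Embedded double quotes are escaped by doubling them.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join the escaped fields of one row with commas.
    static String toCsvLine(List<String> fields) {
        return fields.stream()
                .map(CsvExportSketch::escape)
                .collect(Collectors.joining(","));
    }

    // Build the whole document: a header line followed by one line per row.
    static String toCsv(List<String> header, List<List<String>> rows) {
        StringBuilder sb = new StringBuilder(toCsvLine(header)).append('\n');
        for (List<String> row : rows) {
            sb.append(toCsvLine(row)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample rows, for demonstration only.
        List<List<String>> rows = List.of(
                List.of("Mug, large", "mug.jpg", "4.99"),
                List.of("Poster", "poster.png", "12.00"));
        System.out.print(toCsv(List.of("name", "picture", "price"), rows));
    }
}
```

    The same in-memory rows could just as well be handed to a JSON or XML writer, or inserted into MySQL; only the final serialization step changes.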

    I can also process the data if you want, e.g. extract frequencies, trends, and other cool things...
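
    To give an idea of what "extract frequencies" could mean in the simplest case, here is a small Java sketch that counts how often each term appears in a piece of scraped text. The class `TermFrequencySketch` and its `countTerms` method are hypothetical, and the tokenization (lowercase, split on anything that is not a letter or digit) is an assumption, not a description of the actual processing.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: count term occurrences in scraped text.
class TermFrequencySketch {

    // Returns terms in alphabetical order with their occurrence counts.
    static Map<String, Integer> countTerms(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        // Lowercase, then split on runs of characters that are not letters or digits.
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countTerms("Java spider, java threads!"));
    }
}
```

    Trends would need the same counts bucketed by time, for example one map per day, and then compared across buckets.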

    That's not the hard part here... ;)
     
    incdeveloper, Mar 18, 2011