Buying Custom Web Scraper - Details Inside >>

Discussion in 'Programming' started by metros, Dec 17, 2016.

  1. #1
    I need a custom web scraper. I will describe what I am looking for in as much detail as I can.
    What I need is the following:
    * Scraper targets: torrent, download, online movie and online TV streaming sites, and similar types of domains.
    * An API script that analyzes URLs (from thousands of random domains), extracts the movie/TV title, and then calls TMDB/TVMaze/Mashape/IMDb/other APIs or databases to get the title info.

    For example, the scraper API gets a call from me for the following URL:

    For Movies ->

    URL Example: http://www.hdmovieswatch.net/skyfall-2012-full-movie-online/

    The scraper needs to extract the title "Skyfall 2012".

    Then ->

    Get the title info from TMDB/imdbapi/IMDb/database/Mashape/other and detect whether it is a movie or a TV show.

    Then ->

    Send back the following via GET/POST:

    Movie (Movie or TV), Movie ID, Title, Year, Language, Cast, Crew, Genres, Keywords, RunTime, Budget, Revenue, Movie Website, Release Information (theater date, PG-13, etc.), Rating (RottenTomatoes, IMDb, Metacritic), Recommended Movie Titles (TMDB/other), Trailer YouTube URL, Poster URL (Mashape/TMDB), Titles in other Languages, Movie Summary, Time Stamp (when the data was actually stored in our local DB)

    For TV ->

    URL Example: http://onwatchseries.to/episode/game_of_thrones_s3_e6.html

    Then ->

    Get the title info from TMDB/TVMaze/DB/Mashape/other and detect whether it is a movie or a TV show.

    Then ->

    Send back the following via GET/POST:

    TV (Movie or TV), TV ID (IMDb), Title, Year, Network (HBO/Netflix/other), Type (Scripted/Reality/Talkshow/other), Language, Status (whether the show is still running or not), Days (when the show aired, for example "Sunday"), Show Country (US, UK, CA, etc.), Aired TimeZone (America/New_York), Season (in this example s3 = Season 3), Episode (in this example e6 = Episode 6), Cast, Crew, Genres, Keywords, RunTime (for the given episode), TV Website, Release Information (PG-13, etc.), Rating (RottenTomatoes, IMDb, Metacritic), Recommended TV Titles, Episode Trailer YouTube URL, Poster URL, Titles in other Languages, TV Summary, Time Stamp (when the data was actually stored in our local DB)
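
    The season and episode markers in the TV example URL (s3, e6) could be pulled out with a regular expression. This is only a rough sketch: the function name split_tv_slug and the pattern are assumptions, and it covers markers like "s3_e6" or "s03e06" but not other layouts such as "3x06".

```python
import re

# Matches "s3_e6", "s03e06", "S3-E6" and similar; the leading lookbehind
# stops it from firing on an "s" that is just the end of a word.
SEASON_EPISODE = re.compile(
    r"(?<![a-z0-9])s(\d{1,2})[\W_]*e(\d{1,3})(?![a-z0-9])", re.IGNORECASE
)

def split_tv_slug(slug):
    """Split a URL slug into show title, season number and episode number."""
    match = SEASON_EPISODE.search(slug)
    if match is None:
        return None  # no season/episode marker found
    # Everything before the marker is treated as the show title.
    title = re.sub(r"[\W_]+", " ", slug[:match.start()]).strip()
    return {
        "title": title,
        "season": int(match.group(1)),
        "episode": int(match.group(2)),
    }

print(split_tv_slug("game_of_thrones_s3_e6"))
```

    The title that comes out of this would still have to be looked up at TMDB/TVMaze to fill in the remaining fields.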

    *

    The URL varies from domain to domain, so /skyfall-2012-full-movie-online on torrent site X can be /watch-skyfall-2012-full-movie-online on Y and /watch/free-movie-skyfall-2012-full-online on Z.

    In my opinion, the best (and easiest) way to do this:
    Have a list of terms to exclude (I will add the terms manually myself, so no need to worry about that), such as: full, movie, online, watch, free, etc. The remaining terms are pretty much the movie or TV title itself.
    Note that this tool should work with other languages as well, which is why I think excluding terms is the best approach.
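
    To illustrate the exclusion-list idea, here is a minimal sketch. The function name title_from_url and the starter term list are assumptions for illustration only; the real list would be maintained by hand. It returns the title in lowercase.

```python
import re
from urllib.parse import urlparse, unquote

# Hypothetical starter list; the real one would be edited manually.
EXCLUDE_TERMS = {"full", "movie", "online", "watch", "free"}

def title_from_url(url, exclude=EXCLUDE_TERMS):
    """Return the words of the URL path that are not on the exclusion list."""
    path = unquote(urlparse(url).path)
    # Split on anything that is not a letter or digit. In Python 3 this is
    # Unicode-aware, so words in other languages survive the split.
    words = re.split(r"[\W_]+", path.lower())
    return " ".join(w for w in words if w and w not in exclude)

for url in (
    "http://www.hdmovieswatch.net/skyfall-2012-full-movie-online/",
    "http://y.example/watch-skyfall-2012-full-movie-online",
    "http://z.example/watch/free-movie-skyfall-2012-full-online",
):
    print(title_from_url(url))  # each of the three prints: skyfall 2012
```

    Because only the exclusion list changes per language, supporting a new language would mean adding its filler words to EXCLUDE_TERMS rather than writing new parsing rules.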

    Also, it should be a waterfall-type scraper that tries the fastest/easiest way to grab the title first and falls back to the next one: the URL, else an IMDb link inside the page, else the H1, and so on.
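
    A rough sketch of such a waterfall, under some assumptions: identify, title_from_url and the fetch_html callable are hypothetical names, the term list is a placeholder, and the URL step is treated as failed when no word containing a letter is left (for example /watch/12345). The page is only downloaded if the URL step fails.

```python
import re
from urllib.parse import urlparse

# Hypothetical starter list of terms to ignore.
EXCLUDE_TERMS = frozenset({"full", "movie", "online", "watch", "free"})
IMDB_LINK = re.compile(r"imdb\.com/title/(tt\d+)", re.IGNORECASE)
H1_TAG = re.compile(r"<h1[^>]*>(.*?)</h1>", re.IGNORECASE | re.DOTALL)

def title_from_url(url):
    words = re.split(r"[\W_]+", urlparse(url).path.lower())
    kept = [w for w in words if w and w not in EXCLUDE_TERMS]
    # Give up on the URL when nothing containing a letter is left.
    if not any(c.isalpha() for w in kept for c in w):
        return None
    return " ".join(kept)

def identify(url, fetch_html):
    """Try each source in order and return the first hit, else None."""
    # Step 1: the URL itself, which costs no network request.
    title = title_from_url(url)
    if title:
        return {"source": "url", "title": title}
    # Step 2: download the page only now and look for an IMDb link.
    html = fetch_html(url)
    match = IMDB_LINK.search(html)
    if match:
        return {"source": "imdb_link", "imdb_id": match.group(1)}
    # Step 3: fall back to the text of the first <h1>.
    match = H1_TAG.search(html)
    if match:
        text = re.sub(r"<[^>]+>", "", match.group(1)).strip()
        if text:
            return {"source": "h1", "title": text}
    return None
```

    fetch_html is passed in as a parameter so that the download method (and any caching or throttling around it) stays outside this logic. A real version would likely use an HTML parser instead of regular expressions for steps 2 and 3.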

    The tool should send the results back as fast as possible once it receives the URL.

    *

    The tool should have and communicate with its own database (MySQL/MongoDB/whatever) and store the results there. Once the tool detects a movie/TV show from a given URL, it should store it in the database, so that it can first check whether the title already exists in the database and only call the API if it does not. The idea behind this is simple: to avoid making tons of calls for the same title to IMDb/TMDB/TVMaze over and over, and to make future lookups for the same content faster.
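
    The check-the-database-first flow could look roughly like this. It is a sketch under assumptions: SQLite is swapped in for MySQL/MongoDB only to keep the example self-contained, and open_cache, lookup and the fetch_remote callable (standing in for the real TMDB/TVMaze/IMDb call) are hypothetical names.

```python
import json
import sqlite3
import time

def open_cache(path=":memory:"):
    """Open (or create) the local store; SQLite is used here for brevity."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS titles ("
        "key TEXT PRIMARY KEY, payload TEXT NOT NULL, stored_at REAL NOT NULL)"
    )
    return db

def lookup(db, title, fetch_remote):
    """Return title info from the local store, calling the API only on a miss."""
    # Normalize case and spacing so the same title always maps to one row.
    key = " ".join(title.lower().split())
    row = db.execute("SELECT payload FROM titles WHERE key = ?", (key,)).fetchone()
    if row:
        return json.loads(row[0])  # cache hit: no remote call
    # Cache miss: ask the remote API once, then remember the answer.
    stored_at = time.time()
    info = dict(fetch_remote(title), stored_at=stored_at)
    with db:  # commits on success
        db.execute(
            "INSERT OR REPLACE INTO titles VALUES (?, ?, ?)",
            (key, json.dumps(info), stored_at),
        )
    return info
```

    The stored_at value corresponds to the Time Stamp field in the lists above. Since the payload is kept as JSON text, the same string could be returned to the caller as the response body.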

    *

    In addition, the tool should be able to process titles as well, not just URLs.

    *

    So the bottom line is that I send calls with a URL or title and get back all the title details as described.

    *

    All responses must be valid JSON

    *

    Send me a quote via PM

    Cheers!
     
    metros, Dec 17, 2016 IP
  2. Ovidiu20

    #2
    Hello,

    Is this job still available?
     
    Ovidiu20, Dec 19, 2016 IP
  3. metros

    #3
    Yes
     
    metros, Dec 20, 2016 IP
  4. fastlinks

    #4
    I can scrape, extract and post anything you need, but you have to show me the process manually.

    I can do it with ubot+winautomation.

    If you still need this, add me for discussion:

    skype: ipowerhost2
     
    fastlinks, Jan 7, 2017 IP
  5. EdByrnee

    #5
    Hi, is this job still available? I have some experience writing web scrapers with APIs.
     
    EdByrnee, Jan 30, 2017 IP
  6. somasounds

    #6
    This sounds like an awesome job I would love to do. If anyone is looking for this kind of work, please PM me -- scraping and automation is what I do!
     
    somasounds, Feb 2, 2017 IP