Extract Data to Excel _ Urgent Help Needed!

Discussion in 'Programming' started by nVus, Aug 3, 2009.

  1. #1
    Hello,

    I have an excel file of over 2,000,000 links that are all from the same website that needs to be extracted for data.

    For example the link is;

    http://www.xyz.com/person_name?pub?=________

    The websites are all uniform and I just need 4-5 pieces of information from each one.

    Than it needs to goto an excel file for organization.



    Al
     
    Last edited: Aug 3, 2009
    nVus, Aug 3, 2009 IP
  2. GreenWithEnvy

    GreenWithEnvy Active Member

    Messages:
    125
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    53
    #2
    I'm not sure you will find someone to do that for free : \ Best of luck though!
     
    GreenWithEnvy, Aug 3, 2009 IP
  3. nVus

    nVus Peon

    Messages:
    86
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Well I know of software called AutomationAnywhere and OutWit and some excel commands but cannot put them to use.

    I am sure someone on this forum knows how without having to do the tedious work of copy-paste.

    Nevertheless, I appreciate your comment

    Al
     
    nVus, Aug 3, 2009 IP
  4. premiumscripts

    premiumscripts Peon

    Messages:
    1,062
    Likes Received:
    48
    Best Answers:
    0
    Trophy Points:
    0
    #4
    You need to give more info, a sample input and output file, etc.
     
    premiumscripts, Aug 4, 2009 IP
  5. B.Williams

    B.Williams Active Member

    Messages:
    555
    Likes Received:
    5
    Best Answers:
    0
    Trophy Points:
    58
    #5
    I dont understand what you are trying to say. Elaborate more so we can get it. Thanks :)
     
    B.Williams, Aug 4, 2009 IP
  6. kdaly100

    kdaly100 Peon

    Messages:
    11
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #6
    If you want to initially separate the text that is between the slashes for a start into columns then do the following.

    1. Paste the URLs into the excel file
    2. They should then be one column of URLs
    3. Select Data ': Text to Columns
    4. Select Delimited
    5. Select Other as your delimiter and put in /
    6. Click Next and finish

    This should separate out the individual text in the URLs. Not sure if this is what you wanted but from the limited information it may be.

    Best of luck.
     
    kdaly100, Aug 4, 2009 IP
  7. blogstutalk

    blogstutalk Peon

    Messages:
    8
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #7
    yes, You can try the above method, luckily your problem get solved..

    stutalk.com
     
    blogstutalk, Aug 4, 2009 IP
  8. nixande

    nixande Well-Known Member

    Messages:
    85
    Likes Received:
    8
    Best Answers:
    0
    Trophy Points:
    128
    #8
    [i can't post links yet, so I stripped the http from the answer but you will need it in the search and replace]

    if they are all made in a specific way it is supereasy to get the data in excel.

    things to remember:
    - never ever ruin your original data. copy the data you want to work on to a second column or better sheet.
    - makes saves often with version numbers. the bigger the files the easier they crash.
    - get yourself a mentor who has knowledge in slicing and dicing of the data. it is really not complicated once you get the drift

    If this is your link (and I assume person name and ___ is what you are looking for)
    - copy all of your links xyz.com/person_name?pub?=________ into a second colum
    - copy xyz.com/
    - mark this new colum
    - use "search and replace" to search for xyz.com/ and replace it with nothing
    - select data ': text to columns, select delimited and use ? as your delimiter
    - search in the column with "pub?=________ " and s&r for pub?= again with nothing
    person name and ____ should now be in separate columns


    assuming that it is not as easy as you describe, have a look into the text functions to retrieve the position of a character (in your case the ?) and the 'extract part left, part right, part' functions.
    hth
     
    nixande, Aug 9, 2009 IP