Data Collection Program

Discussion in 'Programming' started by Earl Sargent, Jan 18, 2014.

  1. #1
    Hey guys, I have experience with programming in C and Java and I need a program that collects information from the NYSE for stock analysis for personal use. Everything that I would need is on Balance Sheets, Income Statements and Cash Flow Documents. How would I begin coding this program? I am totally lost. I've been doing web design, so I haven't actually written a program for a while.
     
    Earl Sargent, Jan 18, 2014
  2. 3x3cut0r

    3x3cut0r Member

    Messages:
    9
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    31
    #2
    Hi, just describe what you would do manually, step by step; then it is easier to automate it. If you just copy & paste some data from the web into an XLS file, this can be done in a scripting language such as VBS.
     
    3x3cut0r, Jan 18, 2014
  3. Earl Sargent

    Earl Sargent Greenhorn

    Messages:
    29
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    11
    #3
    I would like to go to the NYSE database, or through Yahoo! Finance, and pull hundreds or thousands of stocks with specific data for each of them (cash flow sheets, balance sheets and quarterly reports), then copy it all to an XLS file. That's it. I may have to run some simple math on some of the numbers also. It sounds like what you just described to me. What is VBS?
     
    Earl Sargent, Jan 18, 2014
  4. 3x3cut0r

    #4
    I'd really need more details. I found a quarterly report; it's an HTML page where text and tables of numbers alternate. You probably need only the numbers. What you need is direct links to these reports; then you download and parse them, and finally copy the data into Excel in some format. VBS is a scripting language created by Microsoft with Visual Basic syntax, but with some simplifications. There are other, much better scripting languages, but I know this one best as I'm forced to use it at work.
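
A minimal sketch of the "parse it and keep only the numbers in the tables" step, written in Python rather than VBS (that language choice is an assumption, not something settled in this thread). It uses only the standard library's `html.parser`. The class name `TableRowParser`, the helper `extract_rows` and the sample markup are made up for illustration; a real report page will need its own handling for nested tables and layout markup.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row.

    Hypothetical helper for illustration; it ignores nested tables.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a table cell (the prose between tables) is skipped.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return all table rows found in the HTML text as lists of strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample markup, not copied from any real report.
SAMPLE = """
<p>Some explanatory text.</p>
<table>
  <tr><th>Item</th><th>2013</th></tr>
  <tr><td>Total Revenue</td><td>1,000</td></tr>
</table>
"""

for row in extract_rows(SAMPLE):
    print(row)
```

The rows come back as plain lists of strings, so turning "1,000" into a number is left as a separate step.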
     
    3x3cut0r, Jan 19, 2014
  5. Earl Sargent

    #5
    I'll try to be as detailed as possible. For example, I would like data for the company EMC (links below). I would go on Yahoo! Finance and get the info from the balance sheets, income statements and cash flow data, then place all of it into an XLS file.

    http://finance.yahoo.com/q/is?s=EMC+Income+Statement&annual
    http://finance.yahoo.com/q/bs?s=EMC+Balance+Sheet&annual
    http://finance.yahoo.com/q/cf?s=EMC+Cash+Flow&annual

    I would like to do this for hundreds of stocks at random. Can I still do this with VBS code?
     
    Earl Sargent, Jan 19, 2014
  6. 3x3cut0r

    #6
    Yes, that is possible. As input, you would need a list of all companies, and then you just generate the links like "http://finance.yahoo.com/q/is?s=" & companyFromList & "+Income+Statement&annual" or something like that, and do the same for the other two links. Then it's only about some HTTP requests, parsing the response and saving into XLS. But what should the XLS look like? A separate sheet for every company (multiplied by 3, as you have three links for every company)? Or a separate XLS for every company? That's imho the most important part to think through, because the better the XLS is laid out, the less work you'll have afterwards with the calculations you need to do.
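
The link generation could be sketched like this in Python (an assumption; the thread itself talks about VBS). The URL pattern is the one from the three EMC links quoted earlier in this thread; whether those pages still respond in that form is not something the sketch checks. The names `BASE`, `PAGES` and `build_links` are made up for illustration.

```python
from urllib.parse import quote

# URL pattern taken from the EMC links quoted in this thread.
BASE = "http://finance.yahoo.com/q/"

# Page code -> label used in the query string (hypothetical constant name).
PAGES = {
    "is": "Income+Statement",
    "bs": "Balance+Sheet",
    "cf": "Cash+Flow",
}


def build_links(ticker):
    """Return the three statement URLs for one ticker symbol."""
    symbol = quote(ticker.strip().upper())
    return {
        page: f"{BASE}{page}?s={symbol}+{label}&annual"
        for page, label in PAGES.items()
    }


# The company list would normally be read from a file; one entry shown here.
for company in ["EMC"]:
    for page, url in build_links(company).items():
        print(page, url)
```

Each URL would then be fetched with an HTTP request and the response parsed, as described above.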
     
    3x3cut0r, Jan 19, 2014
  7. Earl Sargent

    #7
    It sounds like a special XLS for every company would be better for what I would like to do.
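
A rough sketch of the "one file per company" layout, assuming Python and assuming CSV instead of a real XLS file (Excel opens CSV directly, and the standard library has no XLS writer). The function name `save_company`, the layout of one statement after another, and the sample figures are all made up for illustration.

```python
import csv
import tempfile
from pathlib import Path


def save_company(out_dir, ticker, statements):
    """Write one CSV file per company.

    `statements` maps a statement name to its rows (lists of cell values).
    Each statement is written as a title row, its data rows, then an
    empty row as a separator. Returns the path of the written file.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{ticker}.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for name, rows in statements.items():
            writer.writerow([name])
            writer.writerows(rows)
            writer.writerow([])
    return path


if __name__ == "__main__":
    # Made-up figures, only to show the call.
    sample = {
        "Income Statement": [["Total Revenue", "1,000"]],
        "Balance Sheet": [["Total Assets", "5,000"]],
    }
    with tempfile.TemporaryDirectory() as tmp:
        written = save_company(tmp, "EMC", sample)
        print(written.read_text(encoding="utf-8"))
```

Keeping all three statements in one file per ticker means later calculations only need to open a single file per company.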
     
    Earl Sargent, Jan 19, 2014
  8. 3x3cut0r

    #8
    So if you'd like to do it on your own, search for terms like "msxml https vbscript", then something like "excel vbscript". You will definitely need either the "instr" function for parsing the HTML or, better, a Microsoft.XMLDOM object to parse it as XML; in the end, you should be able to do it. Or I can do it for you; it could take me a few hours, 5 at most imho.
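
For comparison, the "instr" style of parsing (plain substring search) looks roughly like this in Python, where `str.find` plays the role of VBScript's `InStr`. This is only a sketch under that assumption; `text_between` and `all_cells` are made-up names, and the sample row is invented.

```python
def text_between(html, start_marker, end_marker, from_pos=0):
    """Find the first text between two markers at or after from_pos.

    Returns (text, next_pos), or (None, -1) when either marker is missing.
    """
    start = html.find(start_marker, from_pos)
    if start == -1:
        return None, -1
    start += len(start_marker)
    end = html.find(end_marker, start)
    if end == -1:
        return None, -1
    return html[start:end].strip(), end + len(end_marker)


def all_cells(html):
    """Collect the contents of every literal <td>...</td> pair, in order."""
    cells = []
    pos = 0
    while True:
        text, pos = text_between(html, "<td>", "</td>", pos)
        if text is None:
            return cells
        cells.append(text)


# Invented sample row, not taken from a real page.
print(all_cells("<tr><td>Net Income</td><td>250</td></tr>"))
```

The search only matches the exact string `<td>`, so a cell written as `<td align="right">` is missed. That fragility is why a real parser is the better option of the two.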
     
    3x3cut0r, Jan 19, 2014
  9. Earl Sargent

    #9
    I would like to try it, but I need it ASAP, so I don't have time to experiment with it. I would like it if you could do it for me. Also, I looked up VBS programming and it would be difficult for me to do this because I am using a Mac and you need to go through the Run program on Windows. Would you like my email to send the final program to?
     
    Earl Sargent, Jan 19, 2014
  10. 3x3cut0r

    #10
    Wait, do you have dual boot? You need Windows to run the script, or at least to run it somewhere on Windows and download the Excel files afterwards. On a Mac, you should probably use AppleScript, Perl, or Python. I can't help you with AppleScript. I could do it in Python, but it wouldn't be ASAP at all, as I'd have to learn Python first :)
     
    3x3cut0r, Jan 19, 2014
  11. Earl Sargent

    #11
    The program would be for use on a Windows computer. I don't have dual boot, but I was planning on running it on a friend's computer. We are collaborating to analyze the stock market together.
     
    Earl Sargent, Jan 19, 2014
  12. aabbccli

    aabbccli Member

    Messages:
    1
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    36
    #12
    You can use HttpClient and jsoup to collect the data.
     
    aabbccli, Mar 17, 2014