Really long MySQL/PHP data filtering

Discussion in 'PHP' started by sadhakim, Feb 18, 2007.

  1. #1
    Okay, I have stock quotes of the NYSE for the last 8 years for every single stock.. so daily quotes (high, low, open, close etc).. I have a few other columns, dollar change, percentage change etc.. now what i wanna do is filter through this data.

    So lets say i wanna find out give me all instances where a stock went up 10% over 5 consecutive days.. Right now I'm querying the database and checking to see if the condition holds by iterating through the data, is there any better way of doing this, because i'm going through about 6 million numbers and so its taking a long long time in PHP (about 20 minutes actually, with all the additional calculation, indexing etc)
     
    sadhakim, Feb 18, 2007 IP
  2. picouli

    picouli Peon

    Messages:
    760
    Likes Received:
    89
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Show us your code, maybe we can optimize it a bit...
     
    picouli, Feb 19, 2007 IP
  3. rays

    rays Active Member

    Messages:
    563
    Likes Received:
    7
    Best Answers:
    0
    Trophy Points:
    58
    #3
    there are few ways ...

    1. Normalization of database structure and optimization of queries, if there are relatively large amount of tables involved try and use specific kinda joins like left join, right join .. etc

    2. If you are having data in MySQL 5 database it will be good idea to use stored procedure as there are lot of advantages of using stored procedures over inline queries due to pre-compiled statements.

    3. If you know the search types which are very frequently getting used try and cache the results in diffrent tables or use XML.

    4. It will be a bit difficult and time consuming in early phase for tagging the results and storing them in diffrent tables ... as in case of stockes result from march 2001 - 2005 its not gonna change ..you can add one more column to table and tag them with the month year and also can cache them in XML's

    You also can come up with few more ideas for optimization, but it will depend on your system requirements and fisibility.
     
    rays, Feb 19, 2007 IP
  4. sadhakim

    sadhakim Peon

    Messages:
    7
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #4
    Here is my code: I know I call the single query function a bunch of times, which adds delay.. I know I should change the order of the if statement in the compute single function too.. this script takes about 20 minutes to run..

    
    
    
    function computeQuery($ex, $volume, $change, $period){
           $query = "select DISTINCT symbol from $ex order by symbol desc";
           $res = mysql_query($query);
           $numberOfRows = mysql_num_rows($res);
           for ($foo=0, $k=0; $foo<$numberOfRows; $foo++, $k++%4){
              $r = mysql_fetch_row($res);
              $symbol = $r[0];
              $criteria = computeSingle($symbol,$ex, $volume, $change, $period);
            if ($foo%2==0){
                    $return_this .="<tr><td class=seperate><a href=results.php?symbol=$symbol&exchange=$ex&change=$change&volume=$volume&period=$period>$symbol</a></td><td class=seperate>$criteria</td>";
                    }
                    else{
                    $return_this .="<td class=seperate><a href=results.php?symbol=$symbol&exchange=$ex&change=$change&volume=$volume&period=$period>$symbol</a></td><td class=seperate>$criteria</td></tr>";
                    }
           }
           return $return_this;
    }
    
    
    function computeSingle($symbol, $exchange, $volume, $change, $period){
    $query = "Select time,volume,percentage_change,avg_volume,williams,rsi from $exchange where symbol = '$symbol' order by time desc";
    $result = mysql_query($query);
    $numrows = mysql_num_rows($result);
    $dec = 0;
    $criteria = 0;
    $calc_percentageChange = 0;
    $calc_volume = 0;
    $calc_avgVolume = 0;
    for ($i=0, $j = $numrows - 90, $k=0; $i <$j && $i < $period; $i++, $k = ($k+1)%$period){
            $row = mysql_fetch_row($result);
            $res_time[$k] = $row[0];
            $res_volume[$k] = $row[1];
            $calc_volume += $row[1];
            $res_percentageChange[$k] = $row[2];
            $calc_percentageChange += $row[2];
            $res_avgVolume[$k] = $row[3];
            $calc_avgVolume += $row[3];
    }
    
    $calc_volume;
    $calc_avgVolume;
    if ($calc_volume >= $calc_avgVolume * $volume && $calc_percentageChange <= $change){
               $criteria++;
    }
    for ($i=0, $j = $numrows - 90, $k=0; $i <$j; $i++, $k = ($k+1)%$period){
                    $row = mysql_fetch_row($result);
                    $calc_volume += $row[1] - $res_volume[$k];
                    $calc_percentageChange += $row[2] - $res_percentageChange[$k];
                    $calc_avgVolume += $row[3] - $res_avgVolume[$k];
                    $calc_volumeP = $calc_volume;
                    $calc_avgVolumeP = $calc_avgVolume;
                    $res_time[$k] = $row[0];
                    $res_volume[$k] = $row[1];
                    $res_percentageChange[$k] = $row[2];
                    $res_avgVolume[$k] = $row[3];
                    if ($calc_volume >= $calc_avgVolume * $volume && $calc_percentageChange <= $change){
                            $criteria++;
                    }
    
            }
    return $criteria;
    }
    
    
    Code (markup):
     
    sadhakim, Feb 19, 2007 IP
  5. sadhakim

    sadhakim Peon

    Messages:
    7
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #5
    1. No joins or anything like that.. so that cant be the problem

    2. Not using MySQL 5, but i might be able to upgrade, would this be a huge speed advantage?

    3. Caching results in XML probably isnt very advantagous.. the same query wont be ran twice, so i doubt it is needed

    4. Not really sure what u mean :confused:
     
    sadhakim, Feb 19, 2007 IP
  6. rays

    rays Active Member

    Messages:
    563
    Likes Received:
    7
    Best Answers:
    0
    Trophy Points:
    58
    #6
    sadhakim can you please share mysql database structure ... so that we can understand exact relation ship within the tables and also data types you are using ...
     
    rays, Feb 20, 2007 IP
  7. sadhakim

    sadhakim Peon

    Messages:
    7
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #7
    very basic data structure... one table for each exchange,

    symbol varchar(4)
    time int unsigned
    open float
    close float
    high float
    low float
    dollar change float
    percentage change float
    volume int
    avg volume int
    unique(symbol, time)

    i'm looking at about 4 million entries into the table, still not very big, i doubt that the database structure is the problem..
     
    sadhakim, Feb 20, 2007 IP