Pseudo-code help PLEASE!!!

Discussion in 'Programming' started by Frinkenstein, Feb 22, 2008.

  1. #1
    Hi there,
    I'm working on a PhD in an engineering field and it requires a bit of programming. My programming knowledge is extremely limited so I'm having a bit of a tough time with it. Actually, I'm totally stumped here.

    I'm working on a program that parameterizes the sampling distribution for the correlation coefficient. Anyway, most of it is programmed already. But, I'm having a problem with one part of the program.

    Say I have some data with 11 X and Y values. I want to calculate the correlation coefficient between them. OK, that's no problem. I've already coded that.

    Now, let's say I want to leave out 1 X and Y data pair so that I'm calculating the correlation coefficient based on 10 pairs of X and Y instead of 11. And, I want to do that 11 times, leaving out a different data pair each time so that I calculate 11 different correlation coefficients based on the remaining 10 data points.

    Now, I want to do the same thing, but instead of leaving out only 1 data pair, I want to leave out 2, 3, 4, ... n-3 data points.

    Why up to n-3 data pairs? Because you need 3 data pairs to calculate a correlation coefficient.


    If it matters, I am coding in Fortran. I know it's an old language, but it works efficiently for many of our applications and we have built up an existing library of programs in that language that have served us well and still work.

    I'm not necessarily looking for you to write my code for me and maybe there aren't many of you that know Fortran very well, but I would really appreciate some help with some pseudo code at least.

    Thanks so much, in advance for any help!!! I'm really stumped with this one.
     
    Frinkenstein, Feb 22, 2008 IP
  2. Frinkenstein

    Frinkenstein Peon

    Messages:
    27
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Just to re-iterate, my main problem is figuring out how to "leave out" the various different combinations of data from the calculation.
     
    Frinkenstein, Feb 22, 2008 IP
  3. walkere

    walkere Active Member

    Messages:
    112
    Likes Received:
    7
    Best Answers:
    0
    Trophy Points:
    58
    #3
    Hmm... that's an interesting problem. I don't know any FORTRAN, so I can't be any help on that front. But here's a concept to get you started.

    Let's say we're leaving one data set out. That's pretty simple.

    You'll need to perform an outer loop eleven times (once for each "combination" of one pair left out).

    Inside that loop, you can loop through the 11 pairs. When you get to the pair whose # of is equal to the outer loop's number, you ignore it. Otherwise calculate the coefficient of the other 10 pairs.

    //  Outer loop
    for ($x = 1; $x <= 11; $x++) {
      //  Perform all the calculations, but ignore the pair at position x
      for ($y = $1; $y <= 11; $y++) {
        if ($y != $x) {
          //  Use this pair for the calculations
        } else {
          //  Do nothing, we're skipping this one
        }
      }
    }
    PHP:
    The key to extending this is knowing all of the combinations we can get by choosing x from 11.

    If we're ignoring two data pairs, we should have 55 possible combinations. Now imagine that we had an array with 55 elements - each one contains the list of combinations (1, 2; 1, 3; 1, 4; 1, 5; ... 10, 11).

    We can use this to repeat the same type of loop for 2 data pairs, 3, 4, etc, up to n - 3.

    // Loop through, one iteration per combination
    for ($x = 1; $x <= # of combinations; $x++) {
      //  Now loop through the 11 data points, and ignore any data point whose # is in the current combination
      for ($y = 1; $y <= 11; $y++) {
        if ($y is not in arrayOfCombinations[$x]) {
          //  $y isn't being excluded on this iteration.  Use this pair for the calculations
        } else {
          //  Ignore $y this time and continue to the next pair
        }
      }
    }
    PHP:
    To do this, you'd need to write or find a function that can return a set of combinations for (x choose y).

    Hope that helps somewhat. Good luck,
    - Walkere
     
    walkere, Feb 22, 2008 IP
  4. able

    able Peon

    Messages:
    44
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #4
    Walkere is on the right lines however I think you should flip the problem.

    Don't think of it as removing pairs, think of it as selecting N pairs from the set.

    When you think of it this way its quite a standard combinatorial problem.

    Would advise you check 'The Art of Programming' by Edward Knuth - it'll be in your library, he summarises the efficiency of numerous combination algorithms.

    You could also just google for fortran combination generator, I'm pretty sure you'll find something :)
     
    able, Feb 22, 2008 IP
  5. Frinkenstein

    Frinkenstein Peon

    Messages:
    27
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #5
    Thanks to both of you for your help so far.

    Walkere: Your code makes sense to me. And, I was thinking of something similar.

    Then the problem becomes, how can I generate the "arrayofcombinations(x)"? Because, essentially, if I have n data pairs, I will need n-2 different arraysofcombinations.

    i.e. if I have 6 data pairs I need the following arrays of combinations:
    -an array of zeros (actually, I don't really need this as my program already calculates the correlation based on the full data set).
    -An array with 1 data point left out (1;2;3;4;5;6)
    -An array with 2 data points left out (1,2;1,3;1,4;1,5;1,6;2,3;2,4;2,5;2,6;3,4;3,5;3,6;4,5;4,6;5,6)
    -An array with 3 data points left out (1,2,3;1,2,4;1,2,5;1,2,6;1,3,4;1,3,5;1,3,6;1,4,5;1,4,6;1,5,6;2,3,4;2,3,5;2,3,6;2,4,5;2,4,6;2,5,6;3,4,5;3,4,6;3,5,6;4,5,6)

    To me, this is as hard as the original problem.
     
    Frinkenstein, Feb 22, 2008 IP
  6. Frinkenstein

    Frinkenstein Peon

    Messages:
    27
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #6
    Arrays can be dynamically allocated in Fortran95, so that's not a problem. But, the problem is how can I generate n-2 arrays where n is not known.
     
    Frinkenstein, Feb 22, 2008 IP
  7. walkere

    walkere Active Member

    Messages:
    112
    Likes Received:
    7
    Best Answers:
    0
    Trophy Points:
    58
    #7
    It's kind of six of one, half dozen of the other. Either way would work. My original suggestion uses the same combinatorial problem - except I'm using the combinations to exclude pairs while you're suggesting that the combinations should be used to include the pairs.

    It'll also end up being the same amount of load either way - since when you invert the binomial coefficient, you get the same amount of combinations. In other words, (11 Choose 4) is the same # of combinations as (4 Choose 11).

    Yes it is. That's why I stopped here before... =)

    I managed to work out a php script that properly iterates every possible set given (xCy). The source code for the script is at the bottom.

    The concept is this:

    First, we create an empty array. This will be made up of other arrays - each of which represents one possible combination set.

    We built a function - getNextCombo - which takes $total (# of items), $choose (# to be chosen), and $lastSet (the last combination result set).

    If $lastSet is empty (i.e. this is the first iteration), we initialize it with a sequential list of numbers starting at 1. Otherwise, we increment the result set by one.

    It starts by looking at the last digit ($lastSet[$choose]). If that is equal to the maximum # of items ($total), it increases a counter. Otherwise, we move on to the next step.

    The idea here is that we go back to the deepest number that isn't maxed out. Then we increment it by one. Then we set all the deeper numbers to be one greater than that.

    For example, let's say we're choosing 4 from 8 and $lastSet is the set {1, 2, 7, 8}, we have to go back to the second digit (2) to find one we can increment. The 4th and 3rd digit are maxed out.

    We can then increment the 2nd digit (to 3), and make the 3rd and 4th digit one greater than that. Giving us the new set {1, 3, 4, 5}.

    Once we get to the point where we can't increment any numbers (the set {5, 6, 7, 8}, we finish!

    I'm sure this isn't the most efficient way to perform this function. In fact it can eat up a lot of memory with all the arrays. The biggest set I was able to process was (20 C 10) - 184,756 combinations.

    After that, the script used the max memory limit of 135mb. You can bypass this by writing the data to a file, though, because the only thing you need in working memory is the last combination set.

    Hope that made some sense and helps in some way. Good luck porting it to or something similar to Fortran!

    - Walkere

    <?php
    function getNextCombo($total, $choose, $lastSet) {
    	if (empty($lastSet)) {
    		//  Initialize with a straight combination 1 to $choose
    		for ($x = 1; $x <= $choose; $x++) {
    			$lastSet[$x] = $x;
    		}
    		
    		return $lastSet;
    	} else {
    		$moveBack = 0;
    		while (($lastSet[$choose - $moveBack] == $total - $moveBack) && ($moveBack < $choose)) {
    			$moveBack++;
    		}
    		
    		if ($moveBack != $choose) {
    			$lastSet[$choose - $moveBack]++;
    			while ($moveBack > 0) {
    				$moveBack--;
    				$lastSet[$choose - $moveBack] = $lastSet[$choose - ($moveBack + 1)] + 1;			
    			}
    				
    			return $lastSet;
    		
    		} else {
    			return 0;	
    		}
    
    	}
    	
    	return 0;
    }
    
    $combines = array();
    $total = 11;
    $choose = 5;
    
    $temp = array();
    while ($temp = getNextCombo($total, $choose, $temp)) {
    	$combines[] = $temp;
    }
    
    foreach ($combines as $key => $val) {
    	echo $key + 1 . ' : ';
    	foreach ($val as $number) {
    		echo $number . " ";
    	}
    	echo "<br />";
    }
    ?>
    PHP:
     
    walkere, Feb 22, 2008 IP
    able likes this.
  8. Frinkenstein

    Frinkenstein Peon

    Messages:
    27
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #8
    Holy crap Walkere, you ROCK!!! I'm going to give that a whirl and see if I can make it work.

    Thanks again for all the help. Truly, you've gone above and beyond here!

    Cheers.
     
    Frinkenstein, Feb 22, 2008 IP
  9. walkere

    walkere Active Member

    Messages:
    112
    Likes Received:
    7
    Best Answers:
    0
    Trophy Points:
    58
    #9
    No problem. I love a good challenge.

    It's been a while since I tried to wrap my brain around something mind-numbingly math oriented. =)

    - Walkere
     
    walkere, Feb 22, 2008 IP
  10. Frinkenstein

    Frinkenstein Peon

    Messages:
    27
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #10
    OK, I've had a chance to read and digest your code and I think I understand the getNextCombo function and I've translated that into Fortran. But, I don't understand the some of the php syntax in the main part.

    i.e. this part:

    
    $combines = array();
    $total = 11;
    $choose = 5;
    
    $temp = array();
    while ($temp = getNextCombo($total, $choose, $temp)) {
    	$combines[] = $temp;
    }
    
    foreach ($combines as $key => $val) {
    	echo $key + 1 . ' : ';
    	foreach ($val as $number) {
    		echo $number . " ";
    	}
    	echo "<br />";
    }
    ?>
    PHP:
    [/QUOTE]

    Is the term: $combines = array() just declaring combines as an array?

    Basically, it looks as though you're calling the getnextcombo to generate the combinatorial. Then you're assigning getnextcombo to the array temp (which is 1 by 5?). Then you're assigning temp to a row within the combines array. This process is repeated until combines is filled up with 462 rows of combinations?

    Then you're writing out an index of the combination number(i.e. 1 through 462 combinations) followed by the getnextcombo?

    Is that about right?

    I'm not quite sure how key, val and number fit in though.

    Thanks again
     
    Frinkenstein, Feb 22, 2008 IP
  11. able

    able Peon

    Messages:
    44
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #11
    Great work walkere :)
     
    able, Feb 23, 2008 IP
  12. walkere

    walkere Active Member

    Messages:
    112
    Likes Received:
    7
    Best Answers:
    0
    Trophy Points:
    58
    #12
    Yes, that just initializes it as an empty array.

    Yes, $temp is always holding the latest combination retrieved. The size of $temp (and the array returned by getNextCombo) depends on how many numbers your choosing. The array will always be one dimensional with a # of elements equal to the $choose value passed to getNextCombo.

    So if you were doing (11 Choose 5), the array would have 5 values.

    And yes, we stop automatically when we hit the last possible combination. The getNextCombo() function returns 0 when there are no more possible combinations to find. The loop "while ($temp = getNextCombo() )" operates as long as $temp is receiving a value - so when getNextCombo() returns 0, the loop ends.

    The $combines[] array is holding all of those combinations so you can do some processing with them later. It doesn't actually help drive the combination process here.

    Yeah, the loops at the end are just output so that I could verify that the script was working.

    Sounds like you've got the gist of it.

    The $key, $val, and $number variables are all generated by PHP's foreach loop. It's a shorthand loop that iterates through every element of an array (good for outputting).

    The line "foreach ($combines as $key => $val)" tells PHP to loop through every element of the $combines array. The $key variable will hold the array key of the current element and the $val variable will hold the value of that variable.

    That value in $val is also an array (remember we pushed the $temp array into each row of $combines), so the next foreach is doing the same thing. "foreach ($val as $number)" tells PHP to iterate through the $val array (which holds our combination set) and $number will represent each value of that array.

    Glad you could make sense of it. Good luck,
    - Walkere
     
    walkere, Feb 23, 2008 IP