Why is the performance for this Java code so inconsistent?


I'm running a small test which, although a micro benchmark, mimics what I'm doing in production pretty well.

I'm creating a 2D array with 5 columns and 10,000,000 rows, filled with random integers between 0 and 19 inclusive. I then want to sum all the numbers in the 3rd column, as a long value, wherever the value in the 2nd column is even. I do this 100 times to warm up, then 100 more times, and time how long that takes.

On my machine it takes around 9 seconds the vast majority of the time; occasionally, however, it takes under 6 seconds.

It doesn't seem to have anything to do with garbage collection or JIT compilation.

Does anyone have any idea why it is faster occasionally?

I run the code with JDK7u11 on Linux with these arguments: -server -XX:+PrintCompilation -Xms500m -Xmx500m -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails. However, using various different JDKs (from 6 all the way to 8) and removing these parameters doesn't seem to affect the timings significantly.

Here is the code:
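As a cross-check on the GC and JIT observation, the same counters can also be read from inside the JVM through the standard `java.lang.management` beans. The following is only a sketch: the class name `JvmActivityCheck`, its helper methods, and the placeholder workload are hypothetical and not part of the original test; in practice the timed loop would take the workload's place.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class JvmActivityCheck {
    // Total number of collections reported so far by all collectors.
    // A collector reports -1 when the count is undefined; those are skipped.
    public static long totalGcCount() {
        long count = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount();
            if (c > 0) {
                count += c;
            }
        }
        return count;
    }

    // Accumulated JIT compilation time in milliseconds,
    // or -1 if this JVM does not expose it.
    public static long totalJitMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    public static void main(String[] args) {
        long gcBefore = totalGcCount();
        long jitBefore = totalJitMillis();

        // Hypothetical placeholder workload; the timed runs would go here.
        long acc = 0;
        for (int i = 0; i < 50000000; i++) {
            acc += i & 1;
        }

        System.out.println("workload result: " + acc);
        System.out.println("GC count delta: " + (totalGcCount() - gcBefore));
        // If JIT timing is unsupported, both readings are -1 and the delta is 0.
        System.out.println("JIT ms delta: " + (totalJitMillis() - jitBefore));
    }
}
```

If both deltas stay at zero across the timed runs, that supports the reading that neither garbage collection nor compilation explains the difference.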

import java.util.ArrayList;
import java.util.Random;

public class JavaPerformanceTest {
    public static void main(String[] args) {
        int numColumns = 5;
        int numRows = 10000000;
        // Column-major layout: data[column][row]
        int[][] data = new int[numColumns][numRows];
        Random rand = new Random(1234);
        for (int j = 0; j < numColumns; j++) {
            for (int i = 0; i < numRows; i++) {
                data[j][i] = rand.nextInt(20);
            }
        }
        int warmup = 100;
        ArrayList<Integer> sums = new ArrayList<Integer>();
        System.out.println("warm " + warmup + " times");
        long warmupStart = System.nanoTime();
        for (int i = 0; i < warmup; i++) {
            sums.add(sum(numRows, data));
        }
        long warmupEnd = System.nanoTime();
        System.out.println("warm complete " + (warmupEnd - warmupStart) / 1000000);
        int numberOfRuns = 100;
        int finalSum = 0;
        long startTime = System.nanoTime();
        for (int i = 0; i < numberOfRuns; i++) {
            finalSum = sum(numRows, data);
        }
        long endTime = System.nanoTime();
        long diff = (endTime - startTime) / 1000000;
        System.out.println("time taken: " + diff + "    sum: " + finalSum);
    }

    // Sums the 3rd column wherever the 2nd column holds an even value.
    public static int sum(int numRows, int[][] columnBased) {
        int sum = 0;
        for (int i = 0; i < numRows; i++) {
            if ((columnBased[1][i] % 2) == 0) {
                sum += columnBased[2][i];
            }
        }
        return sum;
    }
}

Thanks, Nick.

There are a number of possible causes of slow performance, including cache misses and failed branch prediction. I would make sure the code is optimal and repeat the test to ensure the result is stable.

import java.util.ArrayList;
import java.util.Random;

public class JavaPerformanceTest {
    public static void main(String[] args) {
        int numColumns = 5;
        int numRows = 10000000;
        // byte instead of int: the values 0-19 fit, and the data takes a quarter of the memory
        byte[][] data = new byte[numColumns][numRows];
        Random rand = new Random(1234);
        for (int j = 0; j < numColumns; j++) {
            for (int i = 0; i < numRows; i++) {
                data[j][i] = (byte) rand.nextInt(20);
            }
        }
        int warmup = 10;
        ArrayList<Integer> sums = new ArrayList<Integer>();
        System.out.println("warm " + warmup + " times");
        long warmupStart = System.nanoTime();
        for (int i = 0; i < warmup; i++) {
            sums.add(sum(numRows, data));
        }
        long warmupEnd = System.nanoTime();
        System.out.println("warm complete " + (warmupEnd - warmupStart) / 1000000);
        // Repeat the timed block three times to check that the result is stable
        for (int t = 0; t < 3; t++) {
            int numberOfRuns = 100;
            int finalSum = 0;
            long startTime = System.nanoTime();
            for (int i = 0; i < numberOfRuns; i++) {
                finalSum = sum(numRows, data);
            }
            long endTime = System.nanoTime();
            long diff = (endTime - startTime) / 1000000;
            System.out.println("time taken: " + diff + "    sum: " + finalSum);
        }
    }

    public static int sum(int numRows, byte[][] columnBased) {
        int sum = 0;
        // Hoist the column lookups out of the loop
        byte[] col1 = columnBased[1];
        byte[] col2 = columnBased[2];
        for (int i = 0; i < numRows; i++)
            // use multiplication instead of "if" to avoid branch prediction failures
            sum += ((col1[i] + 1) & 1) * col2[i];
        return sum;
    }
}

This prints:

warm 10 times
warm complete 109
time taken: 1006    sum: 47505460
time taken: 1006    sum: 47505460
time taken: 1026    sum: 47505460
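The multiplication trick in `sum` works because `(x + 1) & 1` is 1 when `x` is even and 0 when `x` is odd, so odd rows contribute nothing. A small sketch that checks the two forms agree on random data follows; the class name `BranchFreeCheck`, the method names, and the reduced array size are assumptions made for illustration, not part of the code above.

```java
import java.util.Random;

public class BranchFreeCheck {
    // Branching form, as in the question: add col2[i] only when col1[i] is even.
    public static int sumWithBranch(byte[] col1, byte[] col2) {
        int sum = 0;
        for (int i = 0; i < col1.length; i++) {
            if ((col1[i] % 2) == 0) {
                sum += col2[i];
            }
        }
        return sum;
    }

    // Branch-free form, as in the answer: the mask is 1 for even values, 0 for odd.
    public static int sumBranchFree(byte[] col1, byte[] col2) {
        int sum = 0;
        for (int i = 0; i < col1.length; i++) {
            sum += ((col1[i] + 1) & 1) * col2[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Smaller than the 10,000,000 rows used above, only to keep the check quick.
        int n = 1000000;
        Random rand = new Random(1234);
        byte[] col1 = new byte[n];
        byte[] col2 = new byte[n];
        for (int i = 0; i < n; i++) {
            col1[i] = (byte) rand.nextInt(20);
            col2[i] = (byte) rand.nextInt(20);
        }
        boolean same = sumWithBranch(col1, col2) == sumBranchFree(col1, col2);
        System.out.println("results match: " + same);
    }
}
```

Both forms do the same arithmetic; the difference is that the second has no data-dependent branch inside the loop for the CPU to mispredict.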

In summary: optimising the code can improve performance far more than playing with command line parameters.
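To judge whether a result is stable, it can help to time each run individually rather than only the total, so that a few unusually fast or slow runs stand out. This is a rough sketch under assumptions: the class name `PerRunTiming`, the `timeRuns` helper, and the choice of 1,000,000 rows and 20 runs are hypothetical and only for illustration.

```java
import java.util.Arrays;
import java.util.Random;

public class PerRunTiming {
    // Written on every run so the JIT cannot discard the summation as dead code.
    static volatile int lastSum;

    // Same branch-free sum as in the answer.
    public static int sum(byte[] col1, byte[] col2) {
        int sum = 0;
        for (int i = 0; i < col1.length; i++) {
            sum += ((col1[i] + 1) & 1) * col2[i];
        }
        return sum;
    }

    // Times each call separately; returns durations in microseconds, sorted ascending.
    public static long[] timeRuns(byte[] col1, byte[] col2, int runs) {
        long[] micros = new long[runs];
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            lastSum = sum(col1, col2);
            micros[r] = (System.nanoTime() - start) / 1000;
        }
        Arrays.sort(micros);
        return micros;
    }

    public static void main(String[] args) {
        int n = 1000000; // assumed size, smaller than the original test
        Random rand = new Random(1234);
        byte[] col1 = new byte[n];
        byte[] col2 = new byte[n];
        for (int i = 0; i < n; i++) {
            col1[i] = (byte) rand.nextInt(20);
            col2[i] = (byte) rand.nextInt(20);
        }
        timeRuns(col1, col2, 20);              // warm-up, results discarded
        long[] t = timeRuns(col1, col2, 20);   // measured runs
        System.out.println("min us: " + t[0]
                + "  median us: " + t[t.length / 2]
                + "  max us: " + t[t.length - 1]);
    }
}
```

A wide gap between the minimum and the median suggests the timings are not yet stable; for more rigorous measurements a dedicated benchmark harness is generally a better fit than hand-rolled timing loops.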

