Why is the performance for this Java code so inconsistent?
I'm running a small test which, although it is a micro-benchmark, mimics what I'm doing in production pretty well.
I'm creating a 2D array with 5 columns and 10,000,000 rows, filled with random integers between 0 and 19 inclusive. I then want to sum all of the numbers in the 3rd column as long as the value in the 2nd column is even. I do this 100 times to warm up, then 100 more times, and time how long it takes.
On my machine, the vast majority of the time it takes around 9 seconds; however, occasionally it takes under 6 seconds.
It doesn't appear to be related to garbage collection or JIT compilation.
Does anyone have any idea why it is occasionally faster?
I run the code with JDK 7u11 on Linux with these arguments: -server -XX:+PrintCompilation -Xms500m -Xmx500m -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails. However, using various different JDKs (from 6 all the way to 8) and removing these parameters doesn't seem to affect the timings significantly.
Here is the code:
    import java.util.ArrayList;
    import java.util.Random;

    public class JavaPerformanceTest {

        public static void main(String[] args) {
            int numColumns = 5;
            int numRows = 10000000;
            // Column-based layout: data[column][row]
            int[][] data = new int[numColumns][numRows];
            Random rand = new Random(1234);
            for (int j = 0; j < numColumns; j++) {
                for (int i = 0; i < numRows; i++) {
                    data[j][i] = rand.nextInt(20);
                }
            }

            // Warm-up phase, so the JIT has compiled sum() before timing starts
            int warmUp = 100;
            ArrayList<Integer> sums = new ArrayList<Integer>();
            System.out.println("Warm up " + warmUp + " times");
            long warmUpStart = System.nanoTime();
            for (int i = 0; i < warmUp; i++) {
                sums.add(sum(numRows, data));
            }
            long warmUpEnd = System.nanoTime();
            System.out.println("Warm up complete " + (warmUpEnd - warmUpStart) / 1000000);

            // Timed phase: total wall-clock time in milliseconds for all runs
            int numberOfRuns = 100;
            int finalSum = 0;
            long startTime = System.nanoTime();
            for (int i = 0; i < numberOfRuns; i++) {
                finalSum = sum(numRows, data);
            }
            long endTime = System.nanoTime();
            long diff = (endTime - startTime) / 1000000;
            System.out.println("Time taken: " + diff + " sum: " + finalSum);
        }

        // Sums the 3rd column wherever the 2nd column holds an even value
        public static int sum(int numRows, int[][] columnBased) {
            int sum = 0;
            for (int i = 0; i < numRows; i++) {
                if ((columnBased[1][i] % 2) == 0) {
                    sum += columnBased[2][i];
                }
            }
            return sum;
        }
    }
Thanks, Nick.
There are a number of possible causes of slow performance, including cache misses and failed branch prediction. Make sure your code is optimal, and repeat the test to ensure the result is stable.
    import java.util.ArrayList;
    import java.util.Random;

    public class JavaPerformanceTest {

        public static void main(String[] args) {
            int numColumns = 5;
            int numRows = 10000000;
            // byte instead of int: values are 0-19, so the data takes a quarter of the memory
            byte[][] data = new byte[numColumns][numRows];
            Random rand = new Random(1234);
            for (int j = 0; j < numColumns; j++) {
                for (int i = 0; i < numRows; i++) {
                    data[j][i] = (byte) rand.nextInt(20);
                }
            }

            int warmUp = 10;
            ArrayList<Integer> sums = new ArrayList<Integer>();
            System.out.println("Warm up " + warmUp + " times");
            long warmUpStart = System.nanoTime();
            for (int i = 0; i < warmUp; i++) {
                sums.add(sum(numRows, data));
            }
            long warmUpEnd = System.nanoTime();
            System.out.println("Warm up complete " + (warmUpEnd - warmUpStart) / 1000000);

            // Repeat the timed section three times to check that the result is stable
            for (int t = 0; t < 3; t++) {
                int numberOfRuns = 100;
                int finalSum = 0;
                long startTime = System.nanoTime();
                for (int i = 0; i < numberOfRuns; i++) {
                    finalSum = sum(numRows, data);
                }
                long endTime = System.nanoTime();
                long diff = (endTime - startTime) / 1000000;
                System.out.println("Time taken: " + diff + " sum: " + finalSum);
            }
        }

        public static int sum(int numRows, byte[][] columnBased) {
            int sum = 0;
            // Look up the two columns once, outside the loop
            byte[] col1 = columnBased[1];
            byte[] col2 = columnBased[2];
            // use multiplication instead of "if" to avoid branch prediction failures
            for (int i = 0; i < numRows; i++)
                sum += ((col1[i] + 1) & 1) * col2[i];
            return sum;
        }
    }
prints

    Warm up 10 times
    Warm up complete 109
    Time taken: 1006 sum: 47505460
    Time taken: 1006 sum: 47505460
    Time taken: 1026 sum: 47505460
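If it helps to convince yourself that the multiplication trick computes the same thing as the original "if", here is a minimal sketch that runs both versions on the same random data. The class name BranchFreeSumCheck, the method names, the seed and the array size are made up for illustration; only the two loop bodies come from the code above.

```java
import java.util.Random;

public class BranchFreeSumCheck {

    // Straightforward version: add col2[i] only when col1[i] is even.
    public static int sumWithBranch(byte[] col1, byte[] col2) {
        int sum = 0;
        for (int i = 0; i < col1.length; i++) {
            if ((col1[i] % 2) == 0) {
                sum += col2[i];
            }
        }
        return sum;
    }

    // Branch-free version: ((x + 1) & 1) is 1 for even x and 0 for odd x,
    // so the multiplication keeps or discards col2[i] without an "if".
    public static int sumBranchFree(byte[] col1, byte[] col2) {
        int sum = 0;
        for (int i = 0; i < col1.length; i++) {
            sum += ((col1[i] + 1) & 1) * col2[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1000000; // arbitrary size, chosen only for this check
        Random rand = new Random(42); // arbitrary seed
        byte[] col1 = new byte[n];
        byte[] col2 = new byte[n];
        for (int i = 0; i < n; i++) {
            col1[i] = (byte) rand.nextInt(20);
            col2[i] = (byte) rand.nextInt(20);
        }
        int a = sumWithBranch(col1, col2);
        int b = sumBranchFree(col1, col2);
        System.out.println("branch: " + a + ", branch-free: " + b + ", equal: " + (a == b));
    }
}
```

The two methods should always agree, because adding 1 flips the lowest bit, which is 0 exactly when the value is even.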
In summary: optimising your code can improve performance far more than playing with command line parameters.
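To check whether the timings really are stable, it can also help to time each run separately instead of only the total for 100 runs, so that a few unusually fast or slow runs show up. The following is only a rough sketch of that idea: the class PerRunTiming, the method timeRunsMicros and the checksum field are hypothetical names, and it reuses the branch-free sum on two byte columns.

```java
import java.util.Arrays;
import java.util.Random;

public class PerRunTiming {

    // Accumulates the results so the summing work is not dead code.
    public static long checksum = 0;

    // Same branch-free sum as above, on two plain columns.
    public static int sum(byte[] col1, byte[] col2) {
        int sum = 0;
        for (int i = 0; i < col1.length; i++) {
            sum += ((col1[i] + 1) & 1) * col2[i];
        }
        return sum;
    }

    // Times each call separately; returns the durations in microseconds, sorted ascending.
    public static long[] timeRunsMicros(int runs, byte[] col1, byte[] col2) {
        long[] micros = new long[runs];
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            checksum += sum(col1, col2);
            micros[r] = (System.nanoTime() - start) / 1000;
        }
        Arrays.sort(micros);
        return micros;
    }

    public static void main(String[] args) {
        int numRows = 10000000;
        Random rand = new Random(1234);
        byte[] col1 = new byte[numRows];
        byte[] col2 = new byte[numRows];
        for (int i = 0; i < numRows; i++) {
            col1[i] = (byte) rand.nextInt(20);
            col2[i] = (byte) rand.nextInt(20);
        }
        timeRunsMicros(10, col1, col2); // warm up, timings ignored
        long[] t = timeRunsMicros(100, col1, col2);
        System.out.println("min " + t[0] + " us, median " + t[t.length / 2]
                + " us, max " + t[t.length - 1] + " us (checksum " + checksum + ")");
    }
}
```

A wide gap between the minimum and the median would suggest the runs fall into distinct fast and slow groups rather than varying smoothly.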