Which Scala features have poor performance?

- February 15, 2014

I was wondering lately: Scala runs on the JVM, and the latter is optimized for certain types of operations. Are there any features whose implementation is inefficient on the JVM, and whose use should therefore be discouraged? Could you also explain why they are inefficient?

The first candidate would be the functional programming features. As far as I know, functions are special classes with an apply method, which creates additional overhead compared to languages where functions are just blocks of code.
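As a minimal sketch of what "functions are classes with an apply method" means in practice, the snippet below writes the same function twice: once as a literal and once as an explicit Function1 instance. The object name FunctionObjects and the value names are made up for illustration.

```scala
object FunctionObjects {
  // A function literal; at runtime this value is an object implementing Function1[Int, Int].
  val square: Int => Int = x => x * x

  // The same function written out by hand as an anonymous Function1 instance.
  val squareExplicit: Function1[Int, Int] = new Function1[Int, Int] {
    def apply(x: Int): Int = x * x
  }

  def main(args: Array[String]): Unit = {
    // f(x) is shorthand for f.apply(x), so all three calls do the same thing.
    println(square(4))
    println(square.apply(4))
    println(squareExplicit(4))
  }
}
```

Whether the extra method call costs anything measurable depends on how well the JIT can inline it, so this shows the mechanism rather than the overhead.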

Performance tuning is a deep and complex issue, but three things come to mind.

Scala collections are built for expressive power, not performance.

Consider:

    (1 to 20).map(x => x*x).sum

    val a = new Array[Int](20)
    var i = 0
    while (i < 20) { a(i) = i+1; i += 1 }          // (1 to 20)
    i = 0
    while (i < 20) { a(i) = a(i)*a(i); i += 1 }    // map(x => x*x)
    var s = 0
    i = 0
    while (i < 20) { s += a(i); i += 1 }           // sum
    s

The first is amazingly more compact. The second is 16x faster. Math on integers is really fast; boxing and unboxing is not. The generic collections code is, well, generic, and it relies on boxing.
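To make the boxing point concrete, here is a small sketch that inspects how elements are actually stored. The object and method names (BoxingDemo, arrayElementClass, listElementClass) are invented for this illustration; the cast to List[AnyRef] is only there to look at the stored object without unboxing it first.

```scala
object BoxingDemo {
  // Array[Int] is a JVM int[]: elements are stored as primitives.
  val primitives: Array[Int] = Array(1, 2, 3)

  // List[Int] is generic: each element is stored as a java.lang.Integer object.
  val boxed: List[Int] = List(1, 2, 3)

  // Component type of the underlying JVM array.
  def arrayElementClass: Class[_] = primitives.getClass.getComponentType

  // Runtime class of the object held in the first list cell.
  def listElementClass: Class[_] = boxed.asInstanceOf[List[AnyRef]].head.getClass

  def main(args: Array[String]): Unit = {
    println(arrayElementClass) // int
    println(listElementClass)  // class java.lang.Integer
  }
}
```

Every arithmetic operation on a boxed element has to unwrap the object first and usually wrap the result again, which is where the generic collection code spends its extra time.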

Function2 is only specialized on Int, Long, and Double arguments.

Anything other than an operation on those primitives will require boxing. Beware!

Suppose you want a function whose behavior you can toggle; maybe you want to capitalize letters or not. You try:

    def doOdd(a: Array[Char], f: (Char, Boolean) => Char) = {
      var i = 0
      while (i<a.length) { a(i) = f(a(i), (i&1)==1); i += 1 }
      a
    }

And then you run:

    val text = "The quick brown fox jumps over the lazy dog".toArray
    val f = (c: Char, b: Boolean) => if (b) c.toUpper else c.toLower

    scala> println( doOdd(text, f).mkString )
    tHe qUiCk bRoWn fOx jUmPs oVeR ThE LaZy dOg

Okay, great! Except what if we write

    trait Func_CB_C { def apply(c: Char, b: Boolean): Char }
    val g = new Func_CB_C {
      def apply(c: Char, b: Boolean) = if (b) c.toUpper else c.toLower
    }
    def doOdd2(a: Array[Char], f: Func_CB_C) = {
      var i = 0
      while (i<a.length) { a(i) = f(a(i), (i&1)==1); i += 1 }
      a
    }

instead? Now it's 3x faster. If it's (Int, Int) => Int (or any other permutation of Int/Long/Double arguments and Unit/Boolean/Int/Long/Float/Double return values), rolling your own is unnecessary: it's specialized and works at maximum speed.
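If you want to check a claim like this on your own machine, a rough timing harness is enough for a first look. The sketch below is an assumption-laden illustration, not a proper benchmark: timeNanos, CharToggle, viaFunction2, and viaTrait are hypothetical names, the repetition counts are arbitrary, and a harness such as JMH would give far more trustworthy numbers.

```scala
object NaiveTiming {
  // Hypothetical helper: runs `body` `reps` times and returns the elapsed nanoseconds.
  def timeNanos(reps: Int)(body: => Unit): Long = {
    val start = System.nanoTime()
    var i = 0
    while (i < reps) { body; i += 1 }
    System.nanoTime() - start
  }

  // Hand-rolled function type that takes and returns primitives directly.
  trait CharToggle { def apply(c: Char, b: Boolean): Char }

  // Version using the generic Function2, which is not specialized for Char or Boolean.
  def viaFunction2(a: Array[Char], f: (Char, Boolean) => Char): Array[Char] = {
    var i = 0
    while (i < a.length) { a(i) = f(a(i), (i & 1) == 1); i += 1 }
    a
  }

  // Version using the hand-rolled trait.
  def viaTrait(a: Array[Char], f: CharToggle): Array[Char] = {
    var i = 0
    while (i < a.length) { a(i) = f(a(i), (i & 1) == 1); i += 1 }
    a
  }

  def main(args: Array[String]): Unit = {
    val text = ("The quick brown fox jumps over the lazy dog" * 100).toArray
    val f = (c: Char, b: Boolean) => if (b) c.toUpper else c.toLower
    val g = new CharToggle {
      def apply(c: Char, b: Boolean): Char = if (b) c.toUpper else c.toLower
    }
    // Warm up both paths so the JIT has a chance to compile them before measuring.
    timeNanos(2000)(viaFunction2(text, f))
    timeNanos(2000)(viaTrait(text, g))
    println("Function2: " + timeNanos(2000)(viaFunction2(text, f)) + " ns")
    println("trait:     " + timeNanos(2000)(viaTrait(text, g)) + " ns")
  }
}
```

The ratio you see may differ a lot from the figure quoted above, depending on the JVM version and how much of the boxing the JIT manages to eliminate.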

Just because you can parallelize doesn't mean it's a good idea.

Scala's parallel collections will just try to run your code in parallel. It's up to you to make sure there's enough work that running in parallel is the smart thing to do. There's a lot of overhead in setting up threads and collecting results. Take, for example,

    val v = (1 to 1000).to[Vector]
    v.map(x => x*(x+1))

versus

    val u = (1 to 1000).to[Vector].par
    u.map(x => x*(x+1))

The second map is faster, right, because it's parallel?

Hardly! It's 10x slower because of the overhead (on my machine; results can vary substantially).

Summary

These are just a few of the many issues that you'll normally never have to worry about except in the most performance-critical parts of your code. There are oodles more, which you'll eventually encounter, but as I mentioned in my comment, it would take a book to cover even a decent fraction of them. Note that there are oodles of performance issues in any language, and optimization is often tricky. Save your effort for where it matters!
