Is Scala idiomatic coding style just a cool trap for writing inefficient code?

后端未结

关注

 10  729

I sense that the Scala community has a little big obsession with writing \"concise\", \"cool\", \"scala idiomatic\", \"one-liner\" -if possible- code. This

相关标签:

10条回答

夕颜

2020-12-22 17:27
For reference, here's how splitAt is defined in TraversableLike in the Scala standard library,
```
def splitAt(n: Int): (Repr, Repr) = {
  val l, r = newBuilder
  l.sizeHintBounded(n, this)
  if (n >= 0) r.sizeHint(this, -n)
  var i = 0
  for (x <- this) {
    (if (i < n) l else r) += x
    i += 1
  }
  (l.result, r.result)
}
```
It's not unlike your example code of what a Java programmer might come up with.

I like Scala because, where performance matters, mutability is a reasonable way to go. The collections library is a great example; especially how it hides this mutability behind a functional interface.

Where performance isn't as important, such as some application code, the higher order functions in Scala's library allow great expressivity and programmer efficiency.

Out of curiosity, I picked an arbitrary large file in the Scala compiler (scala.tools.nsc.typechecker.Typers.scala) and counted something like 37 for loops, 11 while loops, 6 concatenations (++), and 1 fold (it happens to be a foldRight).
0 讨论(0)
发布评论:

提交评论
- 加载中...
栀梦

2020-12-22 17:28
```
def removeOneMax (xs: List [Int]) : List [Int] = xs match {                                  
    case x :: Nil => Nil 
    case a :: b :: xs => if (a < b) a :: removeOneMax (b :: xs) else b :: removeOneMax (a :: xs) 
    case Nil => Nil 
}
```
Here is a recursive method, which only iterates once. If you need performance, you have to think about it, if not, not.

You can make it tail-recursive in the standard way: giving an extra parameter carry, which is per default the empty List, and collects the result while iterating. That is, of course, a bit longer, but if you need performance, you have to pay for it:
```
import annotation.tailrec 
@tailrec
def removeOneMax (xs: List [Int], carry: List [Int] = List.empty) : List [Int] = xs match {                                  
  case a :: b :: xs => if (a < b) removeOneMax (b :: xs, a :: carry) else removeOneMax (a :: xs, b :: carry) 
  case x :: Nil => carry 
  case Nil => Nil 
}
```
I don't know what the chances are, that later compilers will improve slower map-calls to be as fast as while-loops. However: You rarely need high speed solutions, but if you need them often, you will learn them fast.

Do you know how big your collection has to be, to use a whole second for your solution on your machine?

As oneliner, similar to Daniel C. Sobrals solution:
```
((Nil : List[Int], xs(0)) /: xs.tail) ((p, x)=> if (p._2 > x) (x :: p._1, p._2) else ((p._2 :: p._1), x))._1
```
but that is hard to read, and I didn't measure the effective performance. The normal pattern is (x /: xs) ((a, b) => /* something */). Here, x and a are pairs of List-so-far and max-so-far, which solves the problem to bring everything into one line of code, but isn't very readable. However, you can earn reputation on CodeGolf this way, and maybe someone likes to make a performance measurement.

And now to our big surprise, some measurements:

An updated timing-method, to get the garbage collection out of the way, and have the hotspot-compiler warm up, a main, and many methods from this thread, together in an Object named
```
object PerfRemMax {

  def timed (name: String, xs: List [Int]) (f: List [Int] => List [Int]) = {
    val a = System.currentTimeMillis 
    val res = f (xs)
    val z = System.currentTimeMillis 
    val delta = z-a
    println (name + ": "  + (delta / 1000.0))
    res
  }

def main (args: Array [String]) : Unit = {
  val n = args(0).toInt
  val funs : List [(String, List[Int] => List[Int])] = List (
    "indexOf/take-drop" -> adrian1 _, 
    "arraybuf"      -> adrian2 _, /* out of memory */
    "paradigmatic1"     -> pm1 _, /**/
    "paradigmatic2"     -> pm2 _, 
    // "match" -> uu1 _, /*oom*/
    "tailrec match"     -> uu2 _, 
    "foldLeft"      -> uu3 _,
    "buf-=buf.max"  -> soc1 _, 
    "for/yield"     -> soc2 _,
    "splitAt"       -> daniel1,
    "ListBuffer"    -> daniel2
    )

  val r = util.Random 
  val xs = (for (x <- 1 to n) yield r.nextInt (n)).toList 

// With 1 Mio. as param, it starts with 100 000, 200k, 300k, ... 1Mio. cases. 
// a) warmup
// b) look, where the process gets linear to size  
  funs.foreach (f => {
    (1 to 10) foreach (i => {
        timed (f._1, xs.take (n/10 * i)) (f._2)
        compat.Platform.collectGarbage
    });
    println ()
  })
}
    
```
I renamed all the methods, and had to modify uu2 a bit, to fit to the common method declaration (List [Int] => List [Int]).

From the long result, i only provide the output for 1M invocations:
```
scala -Dserver PerfRemMax 2000000
indexOf/take-drop:  0.882
arraybuf:   1.681
paradigmatic1:  0.55
paradigmatic2:  1.13
tailrec match: 0.812
foldLeft:   1.054
buf-=buf.max:   1.185
for/yield:  0.725
splitAt:    1.127
ListBuffer: 0.61
```
The numbers aren't completly stable, depending on the sample size, and a bit varying from run to run. For example, for 100k to 1M runs, in steps of 100k, the timing for splitAt was as follows:
```
splitAt: 0.109
splitAt: 0.118
splitAt: 0.129
splitAt: 0.139
splitAt: 0.157
splitAt: 0.166
splitAt: 0.749
splitAt: 0.752
splitAt: 1.444
splitAt: 1.127
```
The initial solution is already pretty fast. splitAt is a modification from Daniel, often faster, but not always.

The measurement was done on a single core 2Ghz Centrino, running xUbuntu Linux, Scala-2.8 with Sun-Java-1.6 (desktop).

The two lessons for me are:
- always measure your performance improvements; it is very hard to estimate it, if you don't do it on a daily basis
- it is not only fun, to write functional code - sometimes the result is even faster
Here is a link to my benchmarkcode, if somebody is interested.
0 讨论(0)
发布评论:

提交评论
- 加载中...
不思量自难忘°

2020-12-22 17:28
Try this:
```
(myList.foldLeft((List[Int](), None: Option[Int]))) {
  case ((_, None),     x) => (List(),               Some(x))
  case ((Nil, Some(m), x) => (List(Math.min(x, m)), Some(Math.max(x, m))
  case ((l, Some(m),   x) => (Math.min(x, m) :: l,  Some(Math.max(x, m))
})._1
```
Idiomatic, functional, traverses only once. Maybe somewhat cryptic if you are not used to functional-programming idioms.

Let's try to explain what is happening here. I will try to make it as simple as possible, lacking some rigor.

A fold is an operation on a List[A] (that is, a list that contains elements of type A) that will take an initial state s0: S (that is, an instance of a type S) and a function f: (S, A) => S (that is, a function that takes the current state and an element from the list, and gives the next state, ie, it updates the state according to the next element).

The operation will then iterate over the elements of the list, using each one to update the state according to the given function. In Java, it would be something like:
```
interface Function<T, R> { R apply(T t); }
class Pair<A, B> { ... }
<State> State fold(List<A> list, State s0, Function<Pair<A, State>, State> f) {
  State s = s0;
  for (A a: list) {
    s = f.apply(new Pair<A, State>(a, s));
  }
  return s;
}
```
For example, if you want to add all the elements of a List[Int], the state would be the partial sum, that would have to be initialized to 0, and the new state produced by a function would simply add the current state to the current element being processed:
```
myList.fold(0)((partialSum, element) => partialSum + element)
```
Try to write a fold to multiply the elements of a list, then another one to find extreme values (max, min).

Now, the fold presented above is a bit more complex, since the state is composed of the new list being created along with the maximum element found so far. The function that updates the state is more or less straightforward once you grasp these concepts. It simply puts into the new list the minimum between the current maximum and the current element, while the other value goes to the current maximum of the updated state.

What is a bit more complex than to understand this (if you have no FP background) is to come up with this solution. However, this is only to show you that it exists, can be done. It's just a completely different mindset.

EDIT: As you see, the first and second case in the solution I proposed are used to setup the fold. It is equivalent to what you see in other answers when they do xs.tail.fold((xs.head, ...)) {...}. Note that the solutions proposed until now using xs.tail/xs.head don't cover the case in which xs is List(), and will throw an exception. The solution above will return List() instead. Since you didn't specify the behavior of the function on empty lists, both are valid.
0 讨论(0)
发布评论:

提交评论
- 加载中...
梦谈多话

2020-12-22 17:34
The biggest inefficiency when you're writing a program is worrying about the wrong things. This is usually the wrong thing to worry about. Why?
1. Developer time is generally much more expensive than CPU time — in fact, there is usually a dearth of the former and a surplus of the latter.
2. Most code does not need to be very efficient because it will never be running on million-item datasets multiple times every second.
3. Most code does need to bug free, and less code is less room for bugs to hide.
0 讨论(0)
发布评论:

提交评论
- 加载中...

春和景丽

2020-12-22 17:34

What about this?

def removeMax(xs: List[Int]) = {
  val buf = xs.toBuffer
  buf -= (buf.max)
}

A bit more ugly, but faster:

def removeMax(xs: List[Int]) = {
  var max = xs.head
  for ( x <- xs.tail ) 
  yield {
    if (x > max) { val result = max; max = x; result}
    else x
  }
}

0 讨论(0)

独厮守ぢ

2020-12-22 17:34

Another option would be:

package code.array

object SliceArrays {
  def main(args: Array[String]): Unit = {
    println(removeMaxCool(Vector(1,2,3,100,12,23,44)))
  }
  def removeMaxCool(xs: Vector[Int]) = xs.filter(_ < xs.max)
}

Using Vector instead of List, the reason is that Vector is more versatile and has a better general performance and time complexity if compared to List.

Consider the following collections operations: head, tail, apply, update, prepend, append

Vector takes an amortized constant time for all operations, as per Scala docs: "The operation takes effectively constant time, but this might depend on some assumptions such as maximum length of a vector or distribution of hash keys"

While List takes constant time only for head, tail and prepend operations.

Using

scalac -print

generates:

package code.array {
  object SliceArrays extends Object {
    def main(args: Array[String]): Unit = scala.Predef.println(SliceArrays.this.removeMaxCool(scala.`package`.Vector().apply(scala.Predef.wrapIntArray(Array[Int]{1, 2, 3, 100, 12, 23, 44})).$asInstanceOf[scala.collection.immutable.Vector]()));
    def removeMaxCool(xs: scala.collection.immutable.Vector): scala.collection.immutable.Vector = xs.filter({
  ((x$1: Int) => SliceArrays.this.$anonfun$removeMaxCool$1(xs, x$1))
}).$asInstanceOf[scala.collection.immutable.Vector]();
    final <artifact> private[this] def $anonfun$removeMaxCool$1(xs$1: scala.collection.immutable.Vector, x$1: Int): Boolean = x$1.<(scala.Int.unbox(xs$1.max(scala.math.Ordering$Int)));
    def <init>(): code.array.SliceArrays.type = {
      SliceArrays.super.<init>();
      ()
    }
  }
}

0 讨论(0)

1 2 下一页

Is Scala idiomatic coding style just a cool trap for writing inefficient code?

And now to our big surprise, some measurements: