A grasp of immutable datastructures

Question

I am learning Scala and, as a good student, I try to obey all the rules I have found.

One rule is: IMMUTABILITY!!!

So I have tried to code everything with immutable data structures and vals, and sometimes this is really hard.

But today I thought to myself: the only important thing is that the object/class should have no mutable state. I am not forced to code all methods in an immutable style, because these methods don't affect each other.

My question: Am I correct, or are there any problems/disadvantages I don't see?

EDIT:

Code example for aishwarya:

def logLikelihood(seq: Iterator[T]): Double = {
  val sequence = seq.toList
  val stateSequence = (0 to order).toList.padTo(sequence.length,order)
  val seqPos = sequence.zipWithIndex

  def probOfSymbAtPos(symb: T, pos: Int) : Double = {
    val state = states(stateSequence(pos))
    M.log(state( seqPos.map( _._1 ).slice(0, pos).takeRight(order), symb))
  }

  val probs = seqPos.map( i => probOfSymbAtPos(i._1,i._2) )

  probs.sum
}

Explanation: It is a method to calculate the log-likelihood of a homogeneous Markov model of variable order. The apply method of state takes all previous symbols and the coming symbol and returns the probability of that symbol following them.

As you may see: the whole method is just multiplying some probabilities which would be much easier using vars.

Answer 1:

The rule is not really immutability, but referential transparency. It's perfectly OK to use locally declared mutable variables and arrays, because none of the effects are observable to any other parts of the overall program.

The principle of referential transparency (RT) is this:

An expression e is referentially transparent if for all programs p every occurrence of e in p can be replaced with the result of evaluating e, without affecting the observable result of p.

Note that if e creates and mutates some local state, it doesn't violate RT since nobody can observe this happening.

That said, I very much doubt that your implementation is any more straightforward with vars.
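To make the point about local state concrete, here is a minimal sketch. The object and method names are made up for illustration and are not from the question. Both methods return the same value for the same input, so a caller cannot tell which one mutates a local variable internally.

```scala
// Hypothetical helpers, not from the question: sum log-probabilities two ways.
object LocalMutation {
  // Uses a local var and a while loop; the mutation never escapes the method.
  def sumLogsWithVar(probs: Seq[Double]): Double = {
    var total = 0.0
    val it = probs.iterator
    while (it.hasNext) total += math.log(it.next())
    total
  }

  // The same computation without any local mutation.
  def sumLogsPure(probs: Seq[Double]): Double =
    probs.map(p => math.log(p)).sum
}
```

Either call can be replaced by its result anywhere in a program, which is all that referential transparency asks for.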

Answer 2:

The case for functional programming is one of being concise in your code and bringing in a more mathematical approach. It can reduce the possibility of bugs and make your code smaller and more readable. As for being easier or not, it does require that you think about your problems differently. But once you get used to thinking with functional patterns, it's likely that the functional style will become easier than the more imperative style.

It is really hard to be perfectly functional and have zero mutable state, but very beneficial to have minimal mutable state. The thing to remember is that everything needs to be done in balance and not taken to the extreme. By reducing the amount of mutable state, you make it harder to write code with unintended consequences. A common pattern is to have a mutable variable whose value is immutable. This way identity (the named variable) and value (an immutable object the variable can be assigned) are separate.

var acc: List[Int] = Nil
// lots of complex stuff that adds values
acc ::= 1
acc ::= 2
acc ::= 3
// loop over the current list; ::= prepends, so acc is List(3, 2, 1) here
acc foreach { i => /* do stuff that mutates acc */ acc ::= i * 10 }
println( acc ) // List(10, 20, 30, 3, 2, 1)

The foreach loops over the value acc held when the loop started, so rebinding acc inside the loop does not affect the iteration. This is much safer than typical Java iterators, where the underlying list can change mid-iteration.
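The contrast with Java collections can be sketched as follows. The object and method names are invented for this illustration. The first method rebinds a var holding an immutable List while iterating; the second structurally modifies a java.util.ArrayList while one of its iterators is live, which the fail-fast iterator rejects.

```scala
import java.util.{ArrayList, ConcurrentModificationException}

// Hypothetical demo object, not part of the original answer.
object SnapshotDemo {
  // foreach walks the List that acc referred to when the loop began.
  def immutableLoop(): List[Int] = {
    var acc: List[Int] = List(3, 2, 1)
    acc.foreach { i => acc ::= i * 10 }
    acc // List(10, 20, 30, 3, 2, 1)
  }

  // Returns true if adding to the list during iteration made the iterator fail.
  def javaLoopFails(): Boolean = {
    val list = new ArrayList[Int]()
    list.add(1); list.add(2); list.add(3)
    val it = list.iterator()
    try {
      while (it.hasNext) {
        val i = it.next()
        list.add(i * 10) // structural change invalidates the live iterator
      }
      false
    } catch {
      case _: ConcurrentModificationException => true
    }
  }
}
```

In the immutable case the loop always sees exactly three elements, no matter what the body does to acc.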

There is also a concurrency concern. Immutable objects are useful because the Java memory model (JSR-133) guarantees that an object's final fields are fully initialized before any other thread can see them, provided the reference to the object does not escape during construction. Non-final fields are "mutable" and carry no such guarantee of proper initialization.

Actors are the perfect place to put mutable state. Objects that represent data should be immutable. Take the following example.

// Uses the legacy scala.actors library.
import scala.actors.Actor
import scala.actors.Actor._
object MyActor extends Actor {
  var acc: List[Int] = Nil
  def act() {
    loop {
      react {
        case i: Int => acc ::= i
        case "what is your current value" => reply( acc )
        case _ => // ignore all other messages
      }
    }
  }
}

In this case we can send the value of acc (which is a List) and not worry about synchronization, because List is immutable: all of the members of the List object are final. Also, because of the immutability, we know that no other actor can change the underlying data structure that was sent, and thus no other actor can change the mutable state of this actor.

Answer 3:

Since Apocalisp has already mentioned the stuff I was going to quote him on, I'll discuss the code. You say it is just multiplying stuff, but I don't see that -- it makes reference to at least three important methods defined outside: order, states and M.log. I can infer that order is an Int, and that states returns a function that takes a List[T] and a T and returns a Double.

There's also some weird stuff going on...

def logLikelihood(seq: Iterator[T]): Double = {
  val sequence = seq.toList

sequence is never used except to define seqPos, so why do that?

  val stateSequence = (0 to order).toList.padTo(sequence.length,order)
  val seqPos = sequence.zipWithIndex

  def probOfSymbAtPos(symb: T, pos: Int) : Double = {
    val state = states(stateSequence(pos))
    M.log(state( seqPos.map( _._1 ).slice(0, pos).takeRight(order), symb))

Actually, you could use sequence here instead of seqPos.map( _._1 ), since all that does is undo the zipWithIndex. Also, slice(0, pos) is just take(pos).

  }

  val probs = seqPos.map( i => probOfSymbAtPos(i._1,i._2) )

  probs.sum
}
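The two simplifications noted in the comments above are easy to check on a small list. This is only an illustrative sketch; the sample values and the object name are made up.

```scala
// Hypothetical sample data, used only to check the two equivalences.
object SliceEquivalence {
  val sequence: List[Char] = List('a', 'b', 'c', 'd')
  val seqPos: List[(Char, Int)] = sequence.zipWithIndex
  val pos = 3

  // Mapping to the first tuple element undoes zipWithIndex.
  val roundTrip: List[Char] = seqPos.map(_._1)

  // slice(0, pos) and take(pos) select the same prefix.
  val viaSlice: List[Char] = sequence.slice(0, pos)
  val viaTake: List[Char] = sequence.take(pos)
}
```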

Now, given the missing methods, it is difficult to assert how this should really be written in functional style. Keeping the mystery methods would yield:

def logLikelihood(seq: Iterator[T]): Double = {
  import scala.collection.immutable.Queue
  case class State(index: Int, order: Int, slice: Queue[T], result: Double)

  seq.foldLeft(State(0, 0, Queue.empty, 0.0)) {
    case (State(index, ord, slice, result), symb) =>
      // ord ramps up from 0 to order, mirroring stateSequence in the question
      val state = states(ord)
      val partial = M.log(state(slice, symb))
      val newSlice = slice enqueue symb
      State(index + 1,
            if (ord == order) ord else ord + 1,
            // keep at most `order` previous symbols in the window
            if (newSlice.size > order) newSlice.dequeue._2 else newSlice,
            result + partial)
  }.result
}

Only I suspect the state/M.log stuff could be made part of State as well. I notice other optimizations now that I have written it like this. The sliding window you are using reminds me, of course, of sliding:

seq.sliding(order).zipWithIndex.map { 
  case (slice, index) => M.log(states(index + order)(slice.init, slice.last))
}.sum

That will only start at the orderth element, so some adaptation would be in order. Not too difficult, though. So let's rewrite it again:

def logLikelihood(seq: Iterator[T]): Double = {
  val sequence = seq.toList
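  // Both sides are converted to List so that ::: type-checks (sliding returns an Iterator).
  val slices = (1 until order).map(sequence.take).toList ::: sequence.sliding(order).toList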
  slices.zipWithIndex.map { 
    case (slice, index) => M.log(states(index)(slice.init, slice.last))
  }.sum
}
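To see what the slices look like, here is a small sketch with made-up values (a five-element list and order 3). It only shows the shape of the data; it does not call states or M.log.

```scala
// Hypothetical sample values, used only to show the shape of the slices.
object SlidingDemo {
  val sequence: List[Int] = List(1, 2, 3, 4, 5)
  val order = 3

  // Full windows of `order` elements, each shifted by one position.
  val windows: List[List[Int]] = sequence.sliding(order).toList
  // List(List(1, 2, 3), List(2, 3, 4), List(3, 4, 5))

  // Shorter prefixes that cover the start of the sequence.
  val prefixes: List[List[Int]] = (1 until order).map(sequence.take).toList
  // List(List(1), List(1, 2))

  val slices: List[List[Int]] = prefixes ::: windows
}
```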

I wish I could see M.log and states... I bet I could turn that map into a foldLeft and do away with these two methods. And I suspect the method returned by states could take the whole slice instead of two parameters.

Still... not bad, is it?
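Since states, order and M.log are not shown, the fold-based version cannot be run as posted. Below is a minimal sketch with made-up stand-ins (a toy states function, order fixed at 2, math.log in place of M.log, and Char in place of T) that compares the question's method against a fold carrying the window and the running total. All of these stand-ins are assumptions for illustration only.

```scala
import scala.collection.immutable.Queue

// Toy stand-ins (hypothetical): the real states, order and M.log are not shown.
object ToyModel {
  val order = 2

  // states(k) scores a symbol given its context; the formula is invented.
  def states(k: Int): (Seq[Char], Char) => Double =
    (context, symb) => 1.0 / (1 + k + context.count(_ == symb))

  // The question's method, specialised to Char and lightly simplified.
  def original(seq: Iterator[Char]): Double = {
    val sequence = seq.toList
    val stateSequence = (0 to order).toList.padTo(sequence.length, order)
    sequence.zipWithIndex.map { case (symb, pos) =>
      val state = states(stateSequence(pos))
      math.log(state(sequence.take(pos).takeRight(order), symb))
    }.sum
  }

  // Fold-based variant: no intermediate lists, no vars.
  def folded(seq: Iterator[Char]): Double = {
    case class State(ord: Int, slice: Queue[Char], result: Double)
    seq.foldLeft(State(0, Queue.empty, 0.0)) {
      case (State(ord, slice, result), symb) =>
        val partial = math.log(states(ord)(slice, symb))
        val newSlice = slice.enqueue(symb)
        State(
          if (ord == order) ord else ord + 1,
          // keep at most `order` previous symbols
          if (newSlice.size > order) newSlice.dequeue._2 else newSlice,
          result + partial)
    }.result
  }
}
```

With these stand-ins, ToyModel.original("abracadabra".iterator) and ToyModel.folded("abracadabra".iterator) should agree up to floating-point rounding.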

Source: https://stackoverflow.com/questions/8346119/a-grasp-of-immutable-datastructures

Tags

scala

data-structures

functional-programming

immutability