I have found AtomicInteger and AtomicLong, but where is AtomicFloat (or AtomicDouble)? Maybe there is some trick?
I'm also surprised there wasn't a built-in solution. The use-case is to get the floating-point sum of values emitted by a collection of concurrent threads without memory use scaling with the number of values. For instance, the concurrent threads are prediction engines and you want to monitor the sum of predicted-minus-truth residuals from all prediction engines in one place. Simultaneous attempts to add to a naive counter would result in lost counts (in exactly the same way as integer counters).
A ConcurrentLinkedQueue can collect the values to sum, but unless there's a thread dedicated to reducing that queue (constantly running result += q.poll() until poll returns null, then q.add(result), then waiting a moment for it to fill up again), the size of the queue would grow to the number of values to sum.
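The reduce step described above could look roughly like the following sketch. The class and method names (QueueSummer, submit, reduce) are made up for illustration, and it assumes exactly one thread ever calls reduce():

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical helper: producers enqueue values, a single reducer folds them together.
class QueueSummer {
    private final ConcurrentLinkedQueue<Double> queue = new ConcurrentLinkedQueue<Double>();

    // Called by any producer thread.
    void submit(double value) {
        queue.add(value);
    }

    // Called by the one dedicated reducer thread (assumption: never concurrently).
    // Drains everything currently queued, then puts the partial sum back,
    // so the queue shrinks to a single element again.
    double reduce() {
        double result = 0.0;
        Double next;
        while ((next = queue.poll()) != null) {
            result += next;
        }
        queue.add(result);
        return result;
    }
}
```

Between calls to reduce() the queue still grows with every submitted value, which is exactly the memory problem mentioned above.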
Java 8 has DoubleAdder and Guava has AtomicDouble (see comments on other questions), but that doesn't help library developers targeting old Java with minimal dependencies. I looked at a sample of DoubleAdder code and AtomicDouble code, and what I found surprised me: at their core they just retry the addition followed by compareAndSet until it succeeds, that is, until no other thread has changed the value in between (DoubleAdder additionally spreads contended updates over several internal cells). The number of threads attempting to write can increase while there's contention, but unless they're in perfect lock-step, some will win the race and get out of the way while others keep retrying.
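For anyone who can use Java 8 or later, a minimal usage sketch of DoubleAdder follows. The class DoubleAdderDemo and its method are hypothetical names for illustration; only DoubleAdder, add and sum come from the JDK:

```java
import java.util.concurrent.atomic.DoubleAdder;

class DoubleAdderDemo {
    // Hypothetical demo: `threads` workers each add `value` to a shared adder `perThread` times.
    static double sumConcurrently(int threads, int perThread, double value) {
        DoubleAdder total = new DoubleAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    total.add(value); // no update is lost, even under contention
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // All workers have finished, so sum() sees every addition.
        return total.sum();
    }
}
```

Note that floating-point addition is not associative, so for arbitrary inputs the result can differ in the last bits depending on the order in which threads happen to add.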
Here's a Scala implementation of what they do:
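    import java.util.concurrent.atomic.AtomicReference
    import scala.annotation.tailrec

    class AtomicDouble {
      // Holds an immutable boxed Double; compareAndSet compares references
      private val value = new AtomicReference(java.lang.Double.valueOf(0.0))

      @tailrec
      final def getAndAdd(delta: Double): Double = {
        val currentValue = value.get
        val newValue = java.lang.Double.valueOf(currentValue.doubleValue + delta)
        if (value.compareAndSet(currentValue, newValue))
          currentValue.doubleValue // we won the race: return the previous value
        else
          getAndAdd(delta) // try, try again
      }
    }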
and an attempted Java translation:
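    import java.util.concurrent.atomic.AtomicReference;

    class AtomicDouble {
        // Typed reference, so value.get() returns a Double without a cast
        private final AtomicReference<Double> value =
            new AtomicReference<Double>(Double.valueOf(0.0));

        double getAndAdd(double delta) {
            while (true) {
                Double currentValue = value.get();
                Double newValue = Double.valueOf(currentValue.doubleValue() + delta);
                // Succeeds only if no other thread replaced the reference in the meantime
                if (value.compareAndSet(currentValue, newValue))
                    return currentValue.doubleValue();
            }
        }
    }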
It works (Scala version tested with hundreds of threads), and the same pattern generalizes from Double to other immutable value types.
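As for the "trick" asked about in the question: the java.util.concurrent.atomic package documentation suggests holding a double in an AtomicLong via Double.doubleToRawLongBits and Double.longBitsToDouble. A sketch of the same retry loop built that way, which avoids boxing and works on old Java, might look like this (the class name BitsAtomicDouble is made up):

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: stores the raw bit pattern of a double in an AtomicLong.
class BitsAtomicDouble {
    private final AtomicLong bits = new AtomicLong(Double.doubleToRawLongBits(0.0));

    double get() {
        return Double.longBitsToDouble(bits.get());
    }

    // Adds delta and returns the value held before the addition.
    double getAndAdd(double delta) {
        while (true) {
            long currentBits = bits.get();
            double currentValue = Double.longBitsToDouble(currentBits);
            long newBits = Double.doubleToRawLongBits(currentValue + delta);
            // Compares bit patterns, not references; retry if another thread got in first.
            if (bits.compareAndSet(currentBits, newBits)) {
                return currentValue;
            }
        }
    }
}
```

Because the comparison is on bits rather than on numeric equality, 0.0 and -0.0 count as different values here, which is harmless for a retry loop.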
However, I don't see any reason why this would be faster or preferred over synchronizing on write only. A blocking solution would also make some threads wait while others increment the counter, but with a guarantee that all will eventually finish (no reliance on imperfect timing) and no wasted CPU (don't compute the sum until you know you're allowed to update it). So why do this?
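For comparison, the blocking alternative I have in mind would be something like the sketch below (SynchronizedDouble is a made-up name). Only the write path takes the lock; the field is volatile so that unsynchronized reads are atomic and see the latest write:

```java
// Hypothetical blocking counterpart to the compareAndSet retry loop.
class SynchronizedDouble {
    // volatile: reads of a double are then atomic and visible without locking
    private volatile double value = 0.0;

    // Only one thread at a time may read-modify-write; others wait for the lock.
    synchronized double getAndAdd(double delta) {
        double previous = value;
        value = previous + delta;
        return previous;
    }

    // Lock-free read of the current total.
    double get() {
        return value;
    }
}
```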