Correct use of the term Monoid

Question

From the following example, I think it is correct to say that String defines a monoid under the concatenation operation since it is an associative binary operation and String happens to have an identity element which is an empty string "".

scala> ("" + "Jane") + "Doe" ==  "" + ("Jane" + "Doe")
res0: Boolean = true

From the various texts I have been reading on the subject lately, it seems that the correct use of the term monoid is that the monoid is actually a combination of both the type (in this case String) and an instance of some monoid type which defines the operation and identity element.

For example, here is a theoretical Monoid type and a concrete instance of it, as commonly defined in various books and articles:

trait Monoid[A] { 
  def op(a1: A, a2: A): A 
  def zero: A 
} 

val stringMonoid = new Monoid[String] { 
  def op(a1: String, a2: String) = a1 + a2 
  val zero = "" 
}

I know that we do not need trait Monoid[A] nor stringMonoid to be defined in the core (Scala, or rather Java) library to support my REPL output and that the example is just a tool to understand the abstract concept of a monoid.

My issue (and I am very possibly thinking about it way too much) is the purist definition. I know that the underlying java.lang.String (or rather StringBuilder) already defines the associative operation, but I do not think that there is an explicit definition of the identity element (in this case just an empty string "") defined anywhere.

Question:

Is String a monoid under the concatenation operation implicitly, just because we happen to know that using an empty string "" provides the identity? Or is it that an explicit definition of the identity element for a type is unnecessary for it to be classed as a monoid (under a particular associative binary operation)?

Answer 1:

I think you already understand the concept correctly and are indeed "thinking too much" about it ;)

A monoid is a triple, as they say in mathematics: a set (think of a type with its values), an associative binary operator on it, and a neutral element. Once you define such a triple, you have defined a monoid. So the answer to

Is String a monoid under the concatenation operation implicitly, just because we happen to know that using an empty string "" provides the identity?

is simply yes. You named the set, the associative binary operation and the neutral element, so you have a monoid. About

already defines the associative operation, but I do not think that there is an explicit definition of the identity element

there's a bit of confusion. You can choose different associative operations on String, each with its corresponding neutral element, and so define various monoids. Concatenation is not some kind of "orthodox" associative operation for defining a monoid on String; it is just, probably, the most obvious one.
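As a rough illustration of that point, here is a sketch of three different monoids over the same String type. The trait has the same shape as the one in the question; the names StringMonoids, concat, flipped and lexMax are made up for this example and do not come from any library.

```scala
object StringMonoids {
  // Same shape as the Monoid trait from the question.
  trait Monoid[A] {
    def op(a1: A, a2: A): A
    def zero: A
  }

  // Ordinary concatenation; the identity is "".
  val concat: Monoid[String] = new Monoid[String] {
    def op(a1: String, a2: String): String = a1 + a2
    val zero: String = ""
  }

  // Concatenation with the arguments flipped; still associative, identity still "".
  val flipped: Monoid[String] = new Monoid[String] {
    def op(a1: String, a2: String): String = a2 + a1
    val zero: String = ""
  }

  // Lexicographic maximum; "" is the identity because it compares
  // less than or equal to every other String.
  val lexMax: Monoid[String] = new Monoid[String] {
    def op(a1: String, a2: String): String =
      if (a1.compareTo(a2) >= 0) a1 else a2
    val zero: String = ""
  }
}
```

All three share the set and, as it happens, the neutral element, but they are different monoids because the operation differs.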

If you define a table as something "with four legs and a flat horizontal surface on them", then anything that fits this definition is a table, regardless of the material it's made of and its other variable characteristics. Now, when do you need to "certify" explicitly that it is a table? Only when you need to use its "table properties", say, to sell it and advertise that you can put things on it and they won't fall off, because the surface is guaranteed to be flat and horizontal.

Sorry if the example is kind of silly; I'm not very good at such analogies. I hope it is still helpful.

Now about instantiating the mentioned "theoretical Monoid type". Such a type is usually called a type class. Neither the existence of such a type (which can be defined in various ways) nor an instance of it is necessary to call the triple (String, (_ ++ _), "") a monoid and to reason about it as a monoid (i.e. to use general monoid properties). What the type class is actually used for is ad-hoc polymorphism, which in Scala is done with implicits. One can, for example, define a polymorphic function

def fold[M](seq: Seq[M])(implicit m: Monoid[M]): M = seq match {
  // Seq.empty is a method, not a stable identifier, so it cannot be used as a pattern
  case Seq()  => m.zero
  case h +: t => m.op(h, fold(t))
}

Then, if your stringMonoid value is declared as an implicit val and is in scope, any call to fold[String] will use its zero and op. The same fold definition will work for other instances of Monoid[...].
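Here is a minimal self-contained sketch of that usage. The intAddition instance and the wrapper object FoldSketch are assumptions added for illustration and are not part of the question; the empty-sequence case is matched with Seq().

```scala
object FoldSketch {
  trait Monoid[A] {
    def op(a1: A, a2: A): A
    def zero: A
  }

  // The question's instance, now declared implicit.
  implicit val stringMonoid: Monoid[String] = new Monoid[String] {
    def op(a1: String, a2: String): String = a1 + a2
    val zero: String = ""
  }

  // Hypothetical second instance: integers under addition.
  implicit val intAddition: Monoid[Int] = new Monoid[Int] {
    def op(a1: Int, a2: Int): Int = a1 + a2
    val zero: Int = 0
  }

  // The instance is chosen by the compiler from the element type M.
  def fold[M](seq: Seq[M])(implicit m: Monoid[M]): M = seq match {
    case Seq()  => m.zero
    case h +: t => m.op(h, fold(t))
  }

  val fullName: String = fold(Seq("Jane", " ", "Doe")) // uses stringMonoid
  val total: Int       = fold(Seq(1, 2, 3))            // uses intAddition
  val nothing: String  = fold(Seq.empty[String])       // falls back to zero
}
```

Note that fold is not tail-recursive, so this is meant to show instance resolution rather than to be used on long sequences.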

Another topic is what happens when you have several instances of Monoid[String]. Read "Where does Scala look for implicits?".

Answer 2:

You may be overthinking this a little. A monoid is an abstract concept that exists outside of programming and type systems (a la category theory). Consider the definition:

A monoid is a set that is closed under an associative binary operation and has an identity element.

You have identified the String type to be a closed set with an associative binary operation (concatenation) and identity element (the empty string). The language may not explicitly tell you what the identity element is, but that doesn't mean that it doesn't exist (we know it does). The String type with the binary operation of concatenation is most certainly a monoid because you can prove that it meets all of the aforementioned properties.
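A real proof is a pen-and-paper argument, but the two laws can at least be spot-checked on a handful of values. The sketch below does that; the helper names isAssociative and hasIdentity and the sample list are invented for this example, and passing such a check is evidence, not proof.

```scala
object MonoidLaws {
  // Checks (a op b) op c == a op (b op c) for every triple of samples.
  def isAssociative[A](samples: Seq[A])(op: (A, A) => A): Boolean =
    samples.forall { a =>
      samples.forall { b =>
        samples.forall { c => op(op(a, b), c) == op(a, op(b, c)) }
      }
    }

  // Checks zero op a == a and a op zero == a for every sample.
  def hasIdentity[A](samples: Seq[A], zero: A)(op: (A, A) => A): Boolean =
    samples.forall(a => op(zero, a) == a && op(a, zero) == a)

  val samples: Seq[String] = Seq("", "Jane", "Doe", " ")

  val concatIsAssociative: Boolean = isAssociative(samples)(_ + _)
  val emptyIsIdentity: Boolean     = hasIdentity(samples, "")(_ + _)

  // A non-example: a single space is not an identity for concatenation.
  val spaceIsIdentity: Boolean     = hasIdentity(samples, " ")(_ + _)
}
```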

Creating a Monoid type class is merely a convenience when working with generic data structures that operate on monoids. Regardless of whether the programming language (or some library) explicitly spells out which sets, with which binary operations and which identities, form monoids, the monoids still exist.

It is important to note that the String type on its own does not constitute a monoid, as it must exist with said binary operation, etc. It may be possible that another monoid exists using the same set, but a different binary operation.

Answer 3:

To answer your question, first we need to answer another one: what is a monoid? One can look at it from at least 2 different perspectives:

Category theory perspective,
Scala programming language perspective.

It is important though to not mix these 2 perspectives. Category theory is a form of mathematics, and the monoid in category theory therefore belongs to the domain of Platonic Forms, while in Scala programming language, according to Platonic realism, it is a Particular.

So a monoid in Scala is a mere reflection of an ideal Platonic monoid. But we are free to choose exactly how we do this reflection, because the Particular is in any case a mere approximation of a Form. The reflection is based on the expressive power of the medium in which it happens (the Scala programming language in our case).

A monoid requires us to designate a unique instance of a given type as the "identity". I'd argue that the best way to uniquely identify an instance of some type is to use a singleton type, because types, as opposed to values, are guaranteed to be unique. Since Scala doesn't have proper singleton types (see here for discussion), it's impossible to clearly express the monoid Form in Scala, and this is probably where your question stems from.

Scala's answer to this problem is type classes. Scala says: "OK, I can't express the empty string as a unique type (nor 0 as the identity of numbers under addition, etc.), so I won't operate on that level at all. Instead, convince me you have a monoid type class instance for String, and I'll let you use it where appropriate." This is a very practical approach, but not a very pure one. Having singleton types would allow one to express the empty string as a unique type, and the string monoid like this (warning, this doesn't compile):

// Identity typeclass
trait Identity[T] {
  type Repr
  val identity: Repr
}

object Identity {
  implicit val stringIdentity = new Identity[String] {
    type Repr = "" // can't do this without singleton types support
    val identity = ""
  }
}

trait Monoid[A] { 
  def op(a1: A, a2: A): A 
  def zero: Identity[A]#Repr
} 

object Monoid {
  implicit def stringMonoid[I](implicit 
    stringIdentity: Identity[String] { type Repr = I }) = new Monoid[String] { 
      def op(a1: String, a2: String): String = a1 + a2 
      def zero: I = stringIdentity.identity
  }
}

I think it would be possible to get pretty close to this using shapeless, but I haven't tried.

To summarize, from the Scala language perspective, having a monoid means having an instance of the monoid type class, because that is how we, the implementers, choose to reflect the ideal monoid into Scala's reality. Your question mixes the ideal monoid with the real-world Scala monoid, which is why it is so hard to formulate and answer. The lack of singleton types in Scala forces us to make assumptions, such as that the empty string is the identity for the String type. With singleton types we would not have to assume; we could prove this fact and use the proof in the monoid definition.

Hope this helps.

Answer 4:

You would only need String (as implemented in the JVM, or as supplemented in the SDK) to be a proper Monoid (e.g. RichString extends Monoid[String]) if you were going to have String (or RichString) method implementations that you take directly from Monoid, so you do not have to implement them. That is, if we know that a String is a Monoid, then it has all its operations without us having to implement them. There are two problems with this scenario: first, String was implemented in the JVM a long time before anyone thought to recognize that String is a monoid, so all useful methods have been written directly into String, or StringBuilder. Second, Monoid probably does not have terribly useful methods anyway.

Source: https://stackoverflow.com/questions/40370430/correct-use-of-term-monoid

Tags

scala

functional-programming