Group or count duplicated letters in Elixir

Submitted by 爷,独闯天下 on 2019-12-10 18:42:26

Question


I'm trying to count duplicated letters in a String in Elixir. I have made several attempts, but without success so far.

Let's take this string as an example: "AAABBAAC"

The desired output would be "3A2B2A1C".

By converting this string to a List, I was able to count every letter, which gives "5A2B1C". What I need instead is to count each run of consecutive letters, in the order the runs appear.

This is the code I tried:

string
|> String.graphemes()
# Totals per letter, ignoring position. A Map is used because Keyword keys must be atoms.
|> Enum.reduce(%{}, fn letter, acc -> Map.update(acc, letter, 1, &(&1 + 1)) end)

What I'm actually trying to produce is a List like ["AAA", "BB", "AA", "C"], so that I can easily count each group with String.length.

Is there a way to produce this?

Thanks in advance.

UPDATE:

Looks like using Enum.chunk_by I'm getting closer to a solution.

UPDATE 2:

Can someone tell me why this question was downvoted to -1? As you can see, I'm very new to StackOverflow, so I want to do this the right way.

UPDATE 3:

I added some code to the main question, following the community's best practices, to avoid confusion and downvotes for being off-topic. In any case, this question is already solved.


Answer 1:


If you implement this recursively, you can easily keep track of the last character seen and its current count, as well as an accumulator that holds the result so far. If the current character equals the last character, you just increase the count. If the two differ, you append the last character's count and the character itself to the accumulator and continue with the next character until the string is empty. Finally, you encode the last run and return the result.

defmodule RunLengthEncoding do
  # public interface: an empty string encodes to an empty string
  def encode(""), do: ""

  # otherwise take the first char and remember it as the current value
  def encode(<<char::utf8, rest::binary>>) do
    do_encode(rest, char, 1, "")
  end

  # current == last, increase the count and proceed
  defp do_encode(<<char::utf8, rest::binary>>, char, count, acc) do
    do_encode(rest, char, count + 1, acc)
  end

  # current != last, reset count, encode previous values and proceed
  defp do_encode(<<char::utf8, rest::binary>>, last, count, acc) do
    do_encode(rest, char, 1, acc <> to_string(count) <> <<last::utf8>>)
  end

  # input empty, encode final values and return
  defp do_encode("", last, count, acc) do
    acc <> to_string(count) <> <<last::utf8>>
  end
end
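
The same bookkeeping (last character, running count, accumulator) can also be expressed as a single pass over the graphemes with Enum.reduce. The following is only a minimal sketch of that idea, not the code from the answer above; the module name RleSketch is hypothetical and chosen for illustration.

```elixir
defmodule RleSketch do
  # Hypothetical module name, used only for this illustration.
  # An empty string has no runs, so it encodes to an empty string.
  def encode(""), do: ""

  def encode(string) do
    [first | rest] = String.graphemes(string)

    # Carry {last, count, acc} through the remaining graphemes.
    {last, count, acc} =
      Enum.reduce(rest, {first, 1, ""}, fn
        # current == last: increase the count
        char, {char, count, acc} ->
          {char, count + 1, acc}

        # current != last: flush the previous run and start a new one
        char, {last, count, acc} ->
          {char, 1, acc <> Integer.to_string(count) <> last}
      end)

    # flush the final run
    acc <> Integer.to_string(count) <> last
  end
end

IO.puts(RleSketch.encode("AAABBAAC"))
# prints 3A2B2A1C
```

Because this sketch works on graphemes rather than on UTF-8 code points, a letter written with combining marks is treated as one unit, which may or may not be what you want.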



Answer 2:


Following Help Center > Answering, here is how I solved it:

string
|> String.graphemes
|> Enum.chunk_by(fn arg -> arg end)
|> Enum.map(fn arg -> to_string(arg) end)
|> Enum.reduce("", fn(arg, acc) -> acc <> to_string(String.length(arg)) <> String.first(arg) end)

Now, explaining:

String.graphemes splits the string into a List of its individual letters:

["A", "A", "A", "B", "B", "A", "A", "C"]

Enum.chunk_by(fn arg -> arg end) groups each run of identical consecutive letters into its own list:

[["A", "A", "A"], ["B", "B"], ["A", "A"], ["C"]]

Enum.map(fn arg -> to_string(arg) end) joins each group back into a single string:

["AAA", "BB", "AA", "C"]

Enum.reduce("", fn(arg, acc) -> acc <> to_string(String.length(arg)) <> String.first(arg) end) finally appends the count (String.length) and the first letter (String.first) of each group to the accumulator, which starts out as "":

"3A2B2A1C"
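
For reuse, the steps above could be wrapped in a small module. This is a sketch under assumptions: the module name LetterRuns and the function names groups/1 and encode/1 are hypothetical, and Enum.join and Enum.map_join stand in for the to_string and Enum.reduce calls used in the answer.

```elixir
defmodule LetterRuns do
  # Hypothetical module and function names, chosen for this example only.

  # Splits a string into runs of identical consecutive letters.
  def groups(string) do
    string
    |> String.graphemes()
    |> Enum.chunk_by(& &1)
    |> Enum.map(&Enum.join/1)
  end

  # Encodes each run as its length followed by its letter.
  def encode(string) do
    string
    |> groups()
    |> Enum.map_join(fn run -> "#{String.length(run)}#{String.first(run)}" end)
  end
end

IO.inspect(LetterRuns.groups("AAABBAAC"))
# prints ["AAA", "BB", "AA", "C"]
IO.puts(LetterRuns.encode("AAABBAAC"))
# prints 3A2B2A1C
```

Keeping groups/1 separate gives the intermediate list asked for in the question, while encode/1 builds the final string from it.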


Source: https://stackoverflow.com/questions/36392742/group-or-count-duplicated-letters-in-elixir
