Counting the occurrences of each element in a List[List[T]] in Scala

Problem description:

Say you have

val docs = List(List("one", "two"), List("two", "three"))

where e.g. List("one", "two") represents a document containing terms "one" and "two", and you want to build a map with the document frequency for every term, i.e. in this case

Map("one" -> 1, "two" -> 2, "three" -> 1)

How would you do that in Scala? (And in an efficient way, assuming a much larger dataset.)

My first Java-like thought is to use a mutable map:

import scala.collection.mutable

val freqs = mutable.Map.empty[String, Int]
for (doc <- docs)
  for (term <- doc)
    freqs(term) = freqs.getOrElse(term, 0) + 1

which works well enough but I'm wondering how you could do that in a more "functional" way, without resorting to a mutable map?

docs.flatten.foldLeft(new Map.WithDefault(Map[String,Int](),Function.const(0))){
  (m,x) => m + (x -> (1 + m(x)))}

Train wreck!

Ah, that's better!

docs.flatten.foldLeft(Map[String,Int]() withDefaultValue 0){
  (m,x) => m + (x -> (1 + m(x)))}
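
As a possible alternative to the fold above, here is a minimal sketch that groups equal terms and counts each group. The object name DocFreq and the method names termCounts and documentFrequency are hypothetical, chosen only for illustration. Note that the fold, like termCounts below, counts every occurrence, so a term repeated inside one document is counted more than once; if strict document frequency is wanted, one option is to deduplicate each document first, as documentFrequency does.

```scala
object DocFreq {
  // Counts every occurrence of a term across all documents.
  // groupBy builds an intermediate list per term, so it may use
  // more memory than the foldLeft version on large inputs.
  def termCounts[T](docs: List[List[T]]): Map[T, Int] =
    docs.flatten.groupBy(identity).map { case (term, hits) => term -> hits.size }

  // Counts each term at most once per document by removing
  // duplicates inside every document before counting.
  def documentFrequency[T](docs: List[List[T]]): Map[T, Int] =
    termCounts(docs.map(_.distinct))
}
```

On Scala 2.13 or later, docs.flatten.groupMapReduce(identity)(_ => 1)(_ + _) should give the same counts as termCounts without building the intermediate groups.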