Counting the occurrences of each element in a List[List[T]] in Scala
Question:
Suppose you have
val docs = List(List("one", "two"), List("two", "three"))
where, for example, List("one", "two") represents a document containing the terms "one" and "two", and you want to build a map with the document frequency of every term, i.e. in this case
Map("one" -> 1, "two" -> 2, "three" -> 1)
How would you do that in Scala? (And efficiently, assuming a much larger dataset.)
My first Java-like thought is to use a mutable map:
import scala.collection.mutable

// Increment the counter for every term seen in every document
val freqs = mutable.Map.empty[String, Int]
for (doc <- docs)
  for (term <- doc)
    freqs(term) = freqs.getOrElse(term, 0) + 1
This works well enough, but I'm wondering how you could do it in a more "functional" way, without resorting to a mutable map.
Answer:
docs.flatten.foldLeft(new Map.WithDefault(Map[String, Int](), Function.const(0))) {
  (m, x) => m + (x -> (1 + m(x)))
}
What a train wreck!
Ah, this is better:
docs.flatten.foldLeft(Map[String, Int]() withDefaultValue 0) {
  (m, x) => m + (x -> (1 + m(x)))
}
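One caveat worth noting: the fold counts every occurrence of a term, so it matches the document frequency only when no term is repeated within a single document. The sketch below wraps the fold in a generic helper and adds a variant that de-duplicates each document first. The object name DocFrequency and the method names termCounts and documentFrequency are made up for illustration and do not come from the question or the answer.

```scala
object DocFrequency {
  // Counts every occurrence of a term across all documents.
  // This is the same immutable fold as in the answer, made generic in T.
  def termCounts[T](docs: List[List[T]]): Map[T, Int] =
    docs.flatten.foldLeft(Map.empty[T, Int].withDefaultValue(0)) {
      (counts, term) => counts.updated(term, counts(term) + 1)
    }

  // Counts the number of documents that contain each term.
  // Assumption: a term repeated inside one document should count once,
  // so each document is de-duplicated before grouping.
  def documentFrequency[T](docs: List[List[T]]): Map[T, Int] =
    docs
      .flatMap(_.distinct)
      .groupBy(identity)
      .map { case (term, hits) => term -> hits.size }
}
```

For the docs value from the question, both methods return Map("one" -> 1, "two" -> 2, "three" -> 1). They differ once a document repeats a term: for List(List("a", "a"), List("a")), termCounts gives Map("a" -> 3) while documentFrequency gives Map("a" -> 2).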