在 Julia 中使用匿名函数的性能损失

问题描述:

我注意到在 Julia 中使用匿名函数会导致性能下降.为了说明我有两种快速排序的实现(取自 Julia 发行版中的微性能基准测试).先升序排列

I have noticed that there is a performance penalty associated with using anonymous functions in Julia. To illustrate I have two implementations of quicksort (taken from the micro performance benchmarks in the Julia distribution). The first sorts in ascending order

function qsort!(a,lo,hi)
    i, j = lo, hi
    while i < hi
        pivot = a[(lo+hi)>>>1]
        while i <= j
            while a[i] < pivot; i += 1; end
            while pivot < a[j]; j -= 1; end
            if i <= j
                a[i], a[j] = a[j], a[i]
                i, j = i+1, j-1
            end
        end
        if lo < j; qsort!(a,lo,j); end
        lo, j = i, hi
    end
    return a
end

第二个需要一个额外的参数:一个匿名函数,可用于指定升序或降序排序,或对更奇特的类型进行比较

The second takes an additional parameter: an anonymous function that can be used to specify ascending or descending sort, or comparison for more exotic types

function qsort_generic!(a,lo,hi,op=(x,y)->x<y)
    i, j = lo, hi
    while i < hi
        pivot = a[(lo+hi)>>>1]
        while i <= j
            while op(a[i], pivot); i += 1; end
            while op(pivot, a[j]); j -= 1; end
            if i <= j
                a[i], a[j] = a[j], a[i]
                i, j = i+1, j-1
            end
        end
        if lo < j; qsort_generic!(a,lo,j,op); end
        lo, j = i, hi
    end
    return a
end

在对 Int64 数组进行排序时存在显着的性能损失,默认版本的速度要快一个数量级.以下是以秒为单位对长度为 N 的数组进行排序的时间:

There is a significant performance penalty when sorting Arrays of Int64, with the default version an order of magnitude faster. Here are times for sorting arrays of length N in seconds:

N       qsort_generic   qsort
2048    0.00125         0.00018
4096    0.00278         0.00029
8192    0.00615         0.00061
16384   0.01184         0.00119
32768   0.04482         0.00247
65536   0.07773         0.00490

问题是:这是由于编译器的限制会及时解决,还是有一种惯用的方法来传递应该在这种情况下使用的函子/匿名函数?

The question is: Is this due to limitations in the compiler that will be ironed out in time, or is there an idiomatic way to pass functors/anonymous functions that should be used in cases like this?

更新从答案看来,这将在编译器中修复.

update From the answers it looks like this is something that will be fixed up in the compiler.

同时,有两个建议的解决方法.这两种方法都相当简单,尽管它们确实开始让人感觉像是您必须在 C++ 中使用的那种拼图游戏(尽管不那么笨拙).

In the mean time, there were two suggested work arounds. Both approaches are fairly straightforward, though they do start to feel like the sort of jiggery-pokery that you have to use in C++ (though not on the same scale of awkward).

第一个是@Toivo Henningsson 建议的 FastAnon 包.我没有尝试这种方法,但看起来不错.

The first is the FastAnon package suggested by @Toivo Henningsson. I didn't try this approach, but it looks good.

我尝试了@simonstar 建议的第二种方法,它给了我与非通用 qsort 相同的性能!实施:

I tried out the second method suggested by @simonstar, which gave me equivalent performance to the non-generic qsort! implementation:

abstract OrderingOp
immutable AscendingOp<:OrderingOp end
immutable DescendingOp<:OrderingOp end
evaluate(::AscendingOp, x, y) = x<y
evaluate(::DescendingOp, x, y) = x>y

function qsort_generic!(a,lo,hi,op=AscendingOp())
    i, j = lo, hi
    while i < hi
        pivot = a[(lo+hi)>>>1]
        while i <= j
            while evaluate(op, a[i], pivot); i += 1; end
            while evaluate(op, pivot, a[j]); j -= 1; end
            if i <= j
                a[i], a[j] = a[j], a[i]
                i, j = i+1, j-1
            end
        end
        if lo < j; qsort_generic!(a,lo,j,op); end
        lo, j = i, hi
    end
    return a
end

感谢大家的帮助.

正如其他人所指出的,您编写的代码是惯用的 Julia,并且有朝一日会很快,但是编译器还没有完全到位.除了使用 FastAnonymous 之外,另一种选择是传递类型而不是匿名函数.对于此模式,您定义了一个不带字段的不可变对象和一个接受类型实例和一些参数的方法(我们称之为 evaluate).然后,您的排序函数将接受 op 对象而不是函数,并调用 evaluate(op, x, y) 而不是 op(x, y)代码>.因为函数专门针对它们的输入类型,所以抽象没有运行时开销.这是减少标准库中的排序顺序规范,以及NumericExtensions.

As others have noted, the code you've written is idiomatic Julia and will someday be fast, but the compiler isn't quite there yet. Besides using FastAnonymous, another option is to pass types instead of anonymous functions. For this pattern, you define an immutable with no fields and a method (let's call it evaluate) that accepts an instance of the type and some arguments. Your sorting function would then accept an op object instead of a function and call evaluate(op, x, y) instead of op(x, y). Because functions are specialized on their input types, there is no runtime overhead to the abstraction. This is the basis for reductions and specification of sort order in the standard library, as well as NumericExtensions.

例如:

immutable AscendingSort; end
evaluate(::AscendingSort, x, y) = x < y

function qsort_generic!(a,lo,hi,op=AscendingSort())
    i, j = lo, hi
    while i < hi
        pivot = a[(lo+hi)>>>1]
        while i <= j
            while evaluate(op, a[i], pivot); i += 1; end
            while evaluate(op, pivot, a[j]); j -= 1; end
            if i <= j
                a[i], a[j] = a[j], a[i]
                i, j = i+1, j-1
            end
        end
        if lo < j; qsort_generic!(a,lo,j,op); end
        lo, j = i, hi
    end
    return a
end