The Basel Problem (Hello, World!)

July 28, 2023

Hello, world! This is my first post, written mainly to test out the features of this website.

Here are snippets in several programming languages that compute partial sums of the Basel series:

First, $\LaTeX$

$$\sum_{n=1}^{\infty} \frac{1}{n^{2}} = \frac{\pi^{2}}{6} \approx 1.6449340668482264$$
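Every snippet in this post truncates the series at some finite N, so it's worth knowing how far a partial sum is from $\pi^{2}/6$. Here's a minimal sketch, with helper names I made up for illustration, suggesting that the gap shrinks roughly like $1/N$:

```python
import math

def basel_partial_sum(n_terms: int) -> float:
    """Sum 1/n^2 for n = 1 .. n_terms (inclusive)."""
    return sum(1.0 / (n * n) for n in range(1, n_terms + 1))

def truncation_error(n_terms: int) -> float:
    """Distance between the exact limit pi^2/6 and the partial sum."""
    return math.pi ** 2 / 6 - basel_partial_sum(n_terms)

# Scaling the error by N gives a value close to 1, i.e. error ~ 1/N.
print(truncation_error(1000) * 1000)
```

By that estimate, summing a billion terms should leave an error of roughly one part in a billion.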

Python

def pi_squared_over_6(N: int) -> float:
    return sum(x**(-2) for x in range(1,N))

Rust

fn pi_squared_over_6(N: u64) -> f64 {
    (1..N).map(|x| 1.0 / ((x*x) as f64)).sum()
}

Haskell

piSquaredOver6 :: Integer -> Double
-- no capital N in Haskell :(
piSquaredOver6 n = sum $ map (\x -> 1 / fromIntegral (x * x)) [1..n]

C

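double pi_squared_over_6(unsigned int N) {
    double sum = 0.0;
    for (unsigned int i = 1; i < N; i++) {
        /* Convert to double before multiplying: i*i in 32-bit integer
           arithmetic would overflow long before i reaches 10^9. */
        sum += 1.0 / ((double)i * (double)i);
    }
    return sum;
}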

Which solution is your favorite?

Performance comparison

Let's see how they perform on an M1 Pro chip with $N = 10^{9}$.

Language	Time (ms, $\mu \pm \sigma$)
Rust (parallel)	$112.6 \pm 3.5$
Rust (--release)	$937.9 \pm 0.4$
C (-O3)	$995.3 \pm 0.8$
Haskell (-O3)	$13454 \pm 205$
Python (3.10)	$67720 \pm 0$
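The post doesn't say how these timings were gathered. One way to get a mean and standard deviation is sketched here using only the standard library; `time_ms` and `runs` are hypothetical names, and the small N just keeps the demo quick:

```python
import statistics
import time
from typing import Callable, Tuple

def time_ms(fn: Callable[[], object], runs: int = 10) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of fn's wall time in ms.

    `time_ms` and `runs` are illustrative names, not from the original post.
    """
    if runs < 2:
        raise ValueError("need at least two runs for a standard deviation")
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), statistics.stdev(samples)

def pi_squared_over_6(N: int) -> float:
    return sum(x ** (-2) for x in range(1, N))

mu, sigma = time_ms(lambda: pi_squared_over_6(10**5), runs=5)
print(f"{mu:.2f} ± {sigma:.2f} ms")
```

A dedicated benchmarking tool would also handle warm-up runs and outliers, which this sketch ignores.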

Optimizing the Python implementation

The Python code takes absurdly long to run, so let's speed it up by using numpy to call vectorized C code:

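import numpy as np

def pi_squared_over_6(N: int) -> float:
    # N - 1 numerators, so x has the same length as r below;
    # np.ones(N) would fail to broadcast against N - 1 denominators.
    x = np.ones(N - 1)
    r = np.arange(1, N)  # n = 1 .. N-1, matching range(1, N)
    sq = np.square(r)
    div = np.divide(x, sq)
    return float(np.sum(div))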

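That's an improvement, but when I look at btm, the high memory usage suggests that most of the work is moving billions of floats around rather than actually computing. Let's try processing the range in chunks: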

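def pi_squared_over_6(N: int) -> float:
    CHUNKS = 25000
    # Assumes N is a multiple of CHUNKS; any remainder is dropped.
    SIZE = N // CHUNKS
    s = 0.0
    x = np.ones(SIZE)  # numerators, reused for every chunk
    for i in range(CHUNKS):
        N_tmp = i * SIZE
        # SIZE consecutive integers, so no term is skipped between chunks
        r = np.arange(N_tmp + 1, N_tmp + SIZE + 1)
        sq = np.square(r)
        div = np.divide(x, sq)
        s += np.sum(div)
        # free the per-chunk arrays
        del sq
        del div
        del r

    return float(s)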

Much better! The runtime is now under 2 seconds!
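To put rough numbers on the memory observation, here's a back-of-the-envelope sketch. It assumes every array element is 8 bytes (float64, or numpy's default int64 on this platform) and that the four arrays `x`, `r`, `sq`, and `div` are all alive at once. The helper names are made up for illustration.

```python
BYTES_PER_ELEMENT = 8  # assumption: float64 and int64 are both 8 bytes wide

def full_array_bytes(n: int, live_arrays: int = 4) -> int:
    """Approximate bytes held when each array spans all n elements."""
    return n * BYTES_PER_ELEMENT * live_arrays

def chunked_bytes(n: int, chunks: int, live_arrays: int = 4) -> int:
    """Approximate bytes held when each array spans only one chunk."""
    size = n // chunks
    return size * BYTES_PER_ELEMENT * live_arrays

N = 10**9
print(full_array_bytes(N) / 1e9)       # 32.0 (GB)
print(chunked_bytes(N, 25_000) / 1e6)  # 1.28 (MB)
```

If those assumptions hold, the unchunked version juggles tens of gigabytes, while each chunk is on the order of a megabyte.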

✦ No LLMs were used in the ideation, research, writing, or editing of this article.