№ 1 (1979) · pp.195–203

A PROOF THAT EULER MISSED...
Apéry's Proof of the Irrationality of ζ(3)
An Informal Report of Alfred van der Poorten
1. Journées Arithmétiques de Marseille–Luminy, June, 1978
The board of programme changes informed us that R. Apéry (Caen) would speak Thursday, 14:00 «Sur l'irrationalité de ζ(3)». Though there had been earlier rumours of his claiming a proof, scepticism was general. The lecture tended to strengthen this view to rank disbelief. Those who listened casually, or who were afflicted with being non-Francophone, appeared to hear only a sequence of unlikely assertions.
EXERCISE
Prove the following amazing claims:
1
For all a_{1}, a_{2}, ...
$$\sum_{k=1}^{\infty} \frac{a_1 a_2 \cdots a_{k-1}}{(x + a_1)(x + a_2) \cdots (x + a_k)} = \frac{1}{x}.$$

2
$$\zeta(3) = \sum_{n=1}^{\infty} \frac{1}{n^3} = \frac{5}{2} \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^3 \binom{2n}{n}}. \tag{1}$$

3 
Consider the recursion:
n^{3}u_{n} + (n – 1)^{3}u_{n–2} = (34n^{3} – 51n^{2} + 27n – 5)u_{n–1}, n ≥ 2. 
(2) 
Let {b_{n}} be the sequence defined by b_{0} = 1, b_{1} = 5, and b_{n} = u_{n} for all n; then the b_{n} are integers! Let {a_{n}} be the sequence defined by a_{0} = 0, a_{1} = 6, a_{n} = u_{n} for all n; then the a_{n} are rational numbers with denominator dividing 2[1, 2, ..., n]^{3} (here [1, 2, ..., n] is the LCM (lowest common multiple) of 1, 2, ..., n).

4 
a_{n}/b_{n} → ζ(3); indeed the convergence is so fast as to prove that ζ(3) cannot be rational. To be precise, for all integers p, q with q sufficiently large relative to ε > 0
$$\left| \zeta(3) - \frac{p}{q} \right| > \frac{1}{q^{13.417820\ldots + \varepsilon}}.$$

Moreover, analogous claims were made for ζ(2):
2'
$$\zeta(2) = \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} = 3 \sum_{n=1}^{\infty} \frac{1}{n^2 \binom{2n}{n}}. \tag{3}$$

3' 
Consider the recursion:
n^{2}u_{n} – (n – 1)^{2}u_{n–2} = (11n^{2} – 11n + 3)u_{n–1}, n ≥ 2. 
(4) 
Let {B_{n}} be the sequence defined by B_{0} = 1, B_{1} = 3, and B_{n} = u_{n} for all n; then the B_{n} all are integers! Let {A_{n}} be the sequence defined by A_{0} = 0, A_{1} = 5, A_{n} = u_{n} for all n; then the A_{n} are rational numbers with denominator dividing [1, 2, ..., n]^{2}.

4' 
A_{n}/B_{n} → ζ(2); indeed the convergence is so fast as to prove that ζ(2) cannot be rational. To be precise, for all integers p, q with q sufficiently large relative to ε > 0
$$\left| \pi^2 - \frac{p}{q} \right| > \frac{1}{q^{11.850782\ldots + \varepsilon}}.$$
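The integrality and convergence claims of 3, 4, 3' and 4' are easy to test numerically. The following is a minimal sketch in exact rational arithmetic, not part of the original report; the function names are hypothetical, and the ζ(2) recursion is used in the form n^{2}u_{n} = (11n^{2} – 11n + 3)u_{n–1} + (n – 1)^{2}u_{n–2}.

```python
from fractions import Fraction
from math import lcm, pi


def solve_zeta3(u0, u1, terms):
    """Iterate n^3 u_n = (34n^3 - 51n^2 + 27n - 5) u_{n-1} - (n-1)^3 u_{n-2}."""
    u = [Fraction(u0), Fraction(u1)]
    for n in range(2, terms):
        p = 34 * n**3 - 51 * n**2 + 27 * n - 5
        u.append((p * u[-1] - (n - 1) ** 3 * u[-2]) / n**3)
    return u


def solve_zeta2(u0, u1, terms):
    """Iterate n^2 u_n = (11n^2 - 11n + 3) u_{n-1} + (n-1)^2 u_{n-2}."""
    u = [Fraction(u0), Fraction(u1)]
    for n in range(2, terms):
        q = 11 * n**2 - 11 * n + 3
        u.append((q * u[-1] + (n - 1) ** 2 * u[-2]) / n**2)
    return u


def lcm_upto(n):
    """[1, 2, ..., n]; the empty LCM (n = 0) is 1."""
    return lcm(*range(1, n + 1))


a, b = solve_zeta3(0, 6, 12), solve_zeta3(1, 5, 12)
A, B = solve_zeta2(0, 5, 12), solve_zeta2(1, 3, 12)
print([int(x) for x in b[:5]])       # first few b_n
print(float(a[11] / b[11]))          # compare with zeta(3)
print(float(A[11] / B[11]), pi**2 / 6)
```

Dividing by n^{3} (or n^{2}) at every step would ordinarily produce denominators of the size of n!^{3}; the sketch lets one watch them fail to appear.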

I heard with some incredulity that, for one, Henri Cohen (Grenoble) believed that these claims might well be valid. Very much intrigued, I joined Hendrik Lenstra (Amsterdam) and Cohen in an evening's discussion, in which Cohen explained and demonstrated most of the details of the proof. We came away convinced that Professor Apéry had indeed found a quite miraculous and magnificent demonstration of the irrationality of ζ(3). But we remained unable to prove a critical step.
2. For the Nonexpert Reader
A number β is irrational if it is not of the form p_{0}/q_{0} with p_{0}, q_{0} integers (∈ Z). A rational number b is characterised by the property that for p, q ∈ Z (q > 0) and b ≠ p/q there exists an integer q_{0} (> 0, of course) such that |b – p/q| ≥ 1/qq_{0}. On the other hand, for irrational β there are always infinitely many p/q (for instance, the convergents of the continued fraction expansion of β) such that |β – p/q| < 1/q^{2}. Plainly this yields a criterion for irrationality: if there is a δ > 0 and a sequence {p_{n}/q_{n}} of rational numbers such that p_{n}/q_{n} ≠ β and |β – p_{n}/q_{n}| < 1/q_{n}^{1+δ}, n ∈ N, then β is irrational.
A successful application of the criterion may yield a measure of irrationality: if |β – p_{n}/q_{n}| < 1/q_{n}^{1+δ}, and the q_{n} are monotonic increasing with q_{n} < q_{n–1}^{1+k} (for n sufficiently large relative to k > 0), then for all integers p, q > 0 (q sufficiently large relative to ε > 0),
$$\left| \beta - \frac{p}{q} \right| > \frac{1}{q^{(1 + \delta)/(\delta - k) + \varepsilon}}.$$
For example, if the sequence {q_{n}} increases geometrically we may take k > 0 arbitrarily small so that 1 + 1/δ becomes an irrationality degree for β. To see the claim suppose that |β – p/q| ≤ 1/q^{σ} and select n so that q_{n–1}^{1+δ} ≤ q^{σ} ≤ q_{n}^{1+δ}. Then
$$\frac{1}{q q_n} \le \left| \frac{p}{q} - \frac{p_n}{q_n} \right| \le \left| \beta - \frac{p_n}{q_n} \right| + \left| \beta - \frac{p}{q} \right| \le \frac{1}{q_n^{1+\delta}} + \frac{1}{q^{\sigma}} \le \frac{2}{q^{\sigma}}.$$
Hence ½q^{σ} ≤ qq_{n} < qq_{n–1}^{1+k} ≤ q^{1+σ(1+k)/(1+δ)}, or σ < (1+δ)/(δ–k) + ε as claimed. This argument is effective (the "sufficiently large" requirements can be made explicit).
It is well known (the Thue–Siegel–Roth theorem) that for β algebraic (a zero of a polynomial a_{0}X^{n} + a_{1}X^{n–1} + ... + a_{n}, a_{i} ∈ Z) always |β – p/q| > 1/q^{2+ε}, for q sufficiently large relative to ε > 0. So, if β is too well approximable by rationals (δ > 1 above) then β is not algebraic, but transcendental. Unfortunately, only a set of measure zero of transcendental numbers can be detected in this way, whilst since the set of algebraic numbers is countable, almost all numbers are transcendental. It is notoriously difficult to prove that any given naturally occurring number is irrational, let alone transcendental. One may be fortunate: for example the usual series for e implies immediately (easy exercise) that e is irrational. In the case of the Riemann ζ-function:
$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \qquad (\operatorname{Re} s > 1)$$
there is the quite well-known fact that
$$\zeta(2k) = \sum_{n=1}^{\infty} \frac{1}{n^{2k}} = (-1)^{k-1} \frac{(2\pi)^{2k}}{2(2k)!} B_{2k}, \qquad k \in \mathbb{N}, \tag{5}$$
where the Bernoulli numbers, B_{m}, are rational (ζ(2) = π^{2}/6, ζ(4) = π^{4}/90, ζ(6) = π^{6}/945, ...). There are some classical techniques (see [1]) for detecting the irrationality of powers of π, but it is most useful to appeal to the theorem of Hermite–Lindemann (whereby e^{α} is transcendental for algebraic α ≠ 0) whence π is transcendental (because e^{πi} = –1) and so a fortiori its powers π^{2k} are irrational, k ∈ N. On the other hand there are no useful analogous closed evaluations of ζ at odd arguments. There is however a famous formula of Ramanujan: let α and β be positive numbers such that αβ = π^{2}. Then if n is any positive integer
$$\frac{1}{\alpha^{n}} \left( \frac{\zeta(2n+1)}{2} + \sum_{k=1}^{\infty} \frac{1}{k^{2n+1}(e^{2\alpha k} - 1)} \right) = \frac{(-1)^{n}}{\beta^{n}} \left( \frac{\zeta(2n+1)}{2} + \sum_{k=1}^{\infty} \frac{1}{k^{2n+1}(e^{2\beta k} - 1)} \right) - 2^{2n} \sum_{k=0}^{n+1} (-1)^{k} \frac{B_{2k}}{(2k)!} \frac{B_{2n+2-2k}}{(2n+2-2k)!} \alpha^{n+1-k} \beta^{k}.$$
Taking α a rational multiple of π one sees that ζ(2n+1) is given as a rational multiple of π^{2n+1} plus two very rapidly convergent series (see for example [2]; indeed the above formula is the natural analogue of Euler's formula (5), and the cited paper gives many other formulas and detailed references). Incidentally, (5) is demonstrated quite easily. The Bernoulli numbers are defined by the generating function (a nontrivial example of an even function!)

$$\frac{z}{e^{z} - 1} + \frac{z}{2} = \sum_{m=0}^{\infty} \frac{B_{2m}}{(2m)!} z^{2m},$$
hence by the recursion
$$\binom{n}{0} B_0 + \binom{n}{1} B_1 + \cdots + \binom{n}{n-1} B_{n-1} = 0, \qquad B_0 = 1,\ B_1 = -\frac{1}{2}, \qquad n = 3, 4, \ldots .$$
On the other hand it is well known that
$$\sin \pi z = \pi z \prod_{n=1}^{\infty} \left( 1 - \frac{z^2}{n^2} \right),$$
so
$$\pi \frac{\cos \pi z}{\sin \pi z} = \pi \operatorname{ctg} \pi z = \frac{1}{z} - \sum_{n=1}^{\infty} \frac{2z}{n^2 - z^2}.$$
But
$$\pi z \operatorname{ctg} \pi z = \pi i z \, \frac{e^{\pi i z} + e^{-\pi i z}}{e^{\pi i z} - e^{-\pi i z}} = \frac{2\pi i z}{e^{2\pi i z} - 1} + \pi i z = \sum_{m=0}^{\infty} (-1)^{m} \frac{(2\pi)^{2m}}{(2m)!} B_{2m} z^{2m},$$
and on the other hand
$$\pi z \operatorname{ctg} \pi z = 1 - 2 \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{z^{2m}}{n^{2m}} = 1 - 2 \sum_{m=1}^{\infty} \zeta(2m) z^{2m}.$$
Comparing coefficients one has (5). With a little ingenuity one can avoid a direct appeal to the infinite product for sin πz or to the expansion for π ctg πz (for a detailed set of references, and some new proofs, see [3]). Indeed proving the irrationality of ζ(2n+1), n ∈ N, constitutes one of the outstanding problems of the theory (ranking with the arithmetic nature of
$$\gamma = \lim_{n \to \infty} \left( \sum_{k=1}^{n} \frac{1}{k} - \ln n \right),$$
and of eπ, e + π, ... which are yet undetermined).
It is some measure of Apery's achievement that these questions have been considered by mathematicians of the top rank over the past few centuries without much success being achieved.
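Euler's formula (5) and the Bernoulli recursion of this section can be checked in a few lines. This is only an illustrative sketch with hypothetical function names: it solves the n-th recursion relation for B_{n–1}, evaluates the right-hand side of (5) in floating point, and compares with a truncated sum of the series.

```python
from fractions import Fraction
from math import comb, factorial, pi


def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] from sum_{j=0}^{n-1} C(n, j) B_j = 0."""
    B = [Fraction(1)]
    for n in range(2, count + 1):
        partial = sum(comb(n, j) * B[j] for j in range(n - 1))
        # The n-th relation is linear in B_{n-1}, with coefficient C(n, n-1).
        B.append(-partial / comb(n, n - 1))
    return B


def zeta_even(k):
    """Right-hand side of Euler's formula (5) for zeta(2k)."""
    B = bernoulli_numbers(2 * k + 1)
    sign = (-1) ** (k - 1)
    return sign * (2 * pi) ** (2 * k) * float(B[2 * k]) / (2 * factorial(2 * k))


def zeta_partial(s, terms):
    """Truncated Dirichlet series for zeta(s)."""
    return sum(1.0 / n**s for n in range(1, terms + 1))


print(bernoulli_numbers(7))
print(zeta_even(2), zeta_partial(4, 2000))
```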
3. Some Irrelevant Explanations
For many of the following details I am indebted to Henri Cohen. All this is due to Apéry, of course. The identity
$$\sum_{k=1}^{K} \frac{a_1 a_2 \cdots a_{k-1}}{(x + a_1)(x + a_2) \cdots (x + a_k)} = \frac{1}{x} - \frac{a_1 a_2 \cdots a_K}{x(x + a_1)(x + a_2) \cdots (x + a_K)}$$
follows easily on writing the righthand side as A_{0} – A_{K} and noting that each term on the left is A_{k–1} – A_{k}. This explains 1. Now put x = n^{2}, a_{k} = –k^{2}, and take k ≤ K ≤ n – 1, to obtain
$$\sum_{k=1}^{n-1} \frac{(-1)^{k-1}(k-1)!^{2}}{(n^2 - 1^2) \cdots (n^2 - k^2)} = \frac{1}{n^2} - \frac{(-1)^{n-1}(n-1)!^{2}}{n^2 (n^2 - 1^2) \cdots (n^2 - (n-1)^2)} = \frac{1}{n^2} - \frac{2(-1)^{n-1}}{n^2 \binom{2n}{n}}.$$
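Writing
$$\varepsilon_{n,k} = \frac{1}{2} \, \frac{k!^{2}(n-k)!}{k^{3}(n+k)!},$$
so that
$$(-1)^{k} n (\varepsilon_{n,k} - \varepsilon_{n-1,k}) = \frac{(-1)^{k-1}(k-1)!^{2}}{(n^2 - 1^2) \cdots (n^2 - k^2)},$$
we have
$$\sum_{n=1}^{N} \sum_{k=1}^{n-1} (-1)^{k} (\varepsilon_{n,k} - \varepsilon_{n-1,k}) = \sum_{n=1}^{N} \frac{1}{n^3} - 2 \sum_{n=1}^{N} \frac{(-1)^{n-1}}{n^3 \binom{2n}{n}}$$
$$= \sum_{k=1}^{N} (-1)^{k} (\varepsilon_{N,k} - \varepsilon_{k,k}) = \frac{1}{2} \sum_{k=1}^{N} \frac{(-1)^{k}}{k^3 \binom{N+k}{k} \binom{N}{k}} - \frac{1}{2} \sum_{k=1}^{N} \frac{(-1)^{k}}{k^3 \binom{2k}{k}},$$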
and on noting that as N → ∞ the first term on the right vanishes, we have 2. Actually the formula 2 is quite well known: it was observed some years ago by Raymond Ayoub (Penn State) and it in fact appears in [4]; independently again it was noticed by R. William Gosper, Jr. (Palo Alto) in [5]. Henri Cohen remarked that the formula is
$$\zeta(3) = \frac{5}{4} \left\{ \operatorname{Li}_3\!\left( \frac{1}{\tau^2} \right) + \frac{2\pi^2}{15} \ln \tau - \frac{2}{3} \ln^3 \tau \right\},$$
where τ = ½(1 + √5) and Li_{3}(x) = ∑ x^{n}/n^{3} is the trilogarithm. Hjortnaes, Ayoub, and respectively Gosper note the integral representations (easily shown equivalent)
$$\zeta(3) = 10 \int_{0}^{\ln \tau} t^2 \operatorname{cth} t \, dt = 10 \int_{0}^{1/2} \frac{\operatorname{arsh}^2 t}{t} \, dt.$$
In the case of ζ(2) the formula is even better known. It is, for example, referred to by Z. R. Melzak [6], but the suggested proof is not quite appropriate. 2' may be proved by slightly varying the argument above: multiply by (–1)^{n–1} instead of dividing by n. Many formulas similar to 2 and 2' appear in the literature and the folklore.
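The telescoping argument of this section can be checked exactly for small N, and the resulting series 2 and 2' summed numerically. The sketch below is illustrative only (function names are hypothetical); it takes ε_{n,k} with the factor ½, that is ε_{n,k} = k!^{2}(n – k)!/(2k^{3}(n + k)!), the normalisation for which the two sides agree.

```python
from fractions import Fraction
from math import comb, factorial, pi


def eps(n, k):
    """epsilon_{n,k} = k!^2 (n-k)! / (2 k^3 (n+k)!), for 1 <= k <= n."""
    return Fraction(factorial(k) ** 2 * factorial(n - k),
                    2 * k**3 * factorial(n + k))


def series_side(N):
    """sum_{n<=N} 1/n^3  -  2 sum_{n<=N} (-1)^(n-1) / (n^3 C(2n, n))."""
    zeta_part = sum(Fraction(1, n**3) for n in range(1, N + 1))
    central = sum(Fraction((-1) ** (n - 1), n**3 * comb(2 * n, n))
                  for n in range(1, N + 1))
    return zeta_part - 2 * central


def telescoped_side(N):
    """sum_{k<=N} (-1)^k (eps_{N,k} - eps_{k,k})."""
    return sum((-1) ** k * (eps(N, k) - eps(k, k)) for k in range(1, N + 1))


def zeta3_series(terms):
    """Partial sum of formula 2 in floating point."""
    return 2.5 * sum((-1) ** (n - 1) / (n**3 * comb(2 * n, n))
                     for n in range(1, terms + 1))


def zeta2_series(terms):
    """Partial sum of formula 2' in floating point."""
    return 3.0 * sum(1 / (n**2 * comb(2 * n, n)) for n in range(1, terms + 1))


print(series_side(6) == telescoped_side(6))
print(zeta3_series(30), zeta2_series(30), pi**2 / 6)
```

The terms decay roughly like 4^{–n}, so thirty terms already exhaust double precision.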
4. Some Nearly Relevant Explanations
All this is quite irrelevant to the proof. It would suffice to introduce the quantities
$$c_{n,k} = \sum_{m=1}^{n} \frac{1}{m^3} + \sum_{m=1}^{k} \frac{(-1)^{m-1}}{2 m^3 \binom{n}{m} \binom{n+m}{m}}, \qquad k \le n \tag{6}$$
and to remark that plainly c_{n,k} → ζ(3) as n → ∞ uniformly in k. One might hope that a sequence c_{n,k} already implies the irrationality of ζ(3) (say, the diagonal, with k = n) but this is not quite so. To see this, it is useful to prove a lemma:
Lemma:
$$2 c_{n,k} \binom{n+k}{k} \in \mathbb{Z} + \frac{\mathbb{Z}}{2^3} + \cdots + \frac{\mathbb{Z}}{n^3} = \frac{\mathbb{Z}}{[1, 2, \ldots, n]^3}.$$
Equivalently: $2[1, 2, \ldots, n]^3 c_{n,k} \binom{n+k}{k}$ is an integer.
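Proof. We check the number of times that any given prime p divides the denominator. But
$$\frac{\binom{n+k}{k}}{\binom{n+m}{m}} = \frac{\binom{n+k}{k-m}}{\binom{k}{m}}$$
and
$$\operatorname{ord}_p \binom{n}{m} \le \left[ \frac{\ln n}{\ln p} \right] - \operatorname{ord}_p m = \operatorname{ord}_p [1, \ldots, n] - \operatorname{ord}_p m,$$
so, we have
$$\operatorname{ord}_p \left\{ \frac{m^3 \binom{n}{m} \binom{n+m}{m}}{\binom{n+k}{k}} \right\} = \operatorname{ord}_p \left\{ \frac{m^3 \binom{n}{m} \binom{k}{m}}{\binom{n+k}{k-m}} \right\} \le 3 \operatorname{ord}_p m + \left[ \frac{\ln n}{\ln p} \right] + \left[ \frac{\ln k}{\ln p} \right] - 2 \operatorname{ord}_p m,$$
which yields the assertion, because m ≤ k ≤ n. We remark that those who know it well know that for n sufficiently large relative to ε > 0,
$$[1, 2, \ldots, n] \le e^{n(1+\varepsilon)}$$
$$\Bigl(\text{roughly: } [1, \ldots, n] = \prod_{p \le n} p^{[\ln n/\ln p]} \le \prod_{p \le n} n \sim n^{n/\ln n} = e^{n}\Bigr).$$
Those who know it really well write
$$\ln [1, \ldots, n] = \sum_{m=1}^{\infty} \theta(n^{1/m}) = \psi(n), \quad \text{where } \theta(n) = \sum_{p \le n} \ln p.$$
Then it is known that ψ(n)/n ≤ 1.03883... (with maximum at n = 113) and indeed |ψ(n) – n| < 0.0242334...·n/ln n for n ≥ 525752; see [7].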
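The lemma and the growth of [1, 2, ..., n] lend themselves to a direct check. The following sketch (hypothetical helper names, exact arithmetic for the lemma, floating point for the logarithm) tests that 2[1, ..., n]^{3}c_{n,k}·binom(n+k, k) is an integer and that ln[1, ..., n]/n stays below the quoted bound for small n.

```python
from fractions import Fraction
from math import comb, lcm, log


def c(n, k):
    """c_{n,k} as defined in (6), for 0 <= k <= n."""
    head = sum(Fraction(1, m**3) for m in range(1, n + 1))
    tail = sum(Fraction((-1) ** (m - 1), 2 * m**3 * comb(n, m) * comb(n + m, m))
               for m in range(1, k + 1))
    return head + tail


def lcm_upto(n):
    """[1, 2, ..., n]; the empty LCM is 1."""
    return lcm(*range(1, n + 1))


def lemma_holds(n):
    """True if 2 [1..n]^3 c_{n,k} C(n+k, k) is an integer for every k <= n."""
    scale = 2 * lcm_upto(n) ** 3
    return all(Fraction(scale * c(n, k) * comb(n + k, k)).denominator == 1
               for k in range(n + 1))


def psi_over_n(n):
    """ln [1, ..., n] / n, i.e. psi(n)/n."""
    return log(lcm_upto(n)) / n


print(all(lemma_holds(n) for n in range(1, 10)))
print(psi_over_n(113))
```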
It will turn out that the c_{n,k} have too large a denominator relative to their closeness to ζ(3). Hence to apply the irrationality criterion we must somehow accelerate the convergence. Apéry described this process as follows: Consider two triangular arrays (defined for k ≤ n) with entries
$$d^{(0)}_{n,k} = c_{n,k} \binom{n+k}{k} \quad \text{and} \quad \binom{n+k}{k}$$
respectively. We recall that the arrays have the property that their "quotient" converges to ζ(3), in the sense that given any "diagonal" {n, k(n)}, the quotient of the corresponding elements of the two arrays converges to ζ(3). Now apply the following transformations to each array:
$$d^{(0)}_{n,k} \to d^{(0)}_{n,n-k} = d^{(1)}_{n,k} \to \binom{n}{k} d^{(1)}_{n,k} = d^{(2)}_{n,k} \to \sum_{m=0}^{k} \binom{k}{m} d^{(2)}_{n,m} = d^{(3)}_{n,k} \to \binom{n}{k} d^{(3)}_{n,k} = d^{(4)}_{n,k} \to \sum_{m=0}^{k} \binom{k}{m} d^{(4)}_{n,m} = d^{(5)}_{n,k},$$
$$\binom{n+k}{k} \to \binom{2n-k}{n} \to \binom{n}{k} \binom{2n-k}{n} \to \sum_{m=0}^{k} \binom{k}{m} \binom{n}{m} \binom{2n-m}{n} \to \sum_{m=0}^{k} \binom{k}{m} \binom{n}{m} \binom{n}{k} \binom{2n-m}{n} \to \sum_{l=0}^{k} \sum_{m=0}^{l} \binom{k}{l} \binom{l}{m} \binom{n}{l} \binom{n}{m} \binom{2n-m}{n}.$$
Of course, the arrays have retained the property that their "quotient" converges to ζ(3), and we still have 2[1, 2, ..., n]^{3}d_{n,k} ∈ Z. We now take the main diagonals (k = n) of the arrays, calling them respectively {a_{n}} and {b_{n}} and make the fantastic assertions embodied in 3! That is, each sequence satisfies the recurrence (2)! This is plainly absurd since surely inter alia a solution {u_{n}} of (2) (with integral initial values u_{0}, u_{1}) will have {u_{n}} with denominator more like n!^{3} than like 1 (or even 2[1, 2, ..., n]^{3}). In Marseille, our amazement was total when our HP67s, calculating {b_{n}} on the one hand from the definition above, and on the other hand by the recurrence (2), kept on producing the same values.
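The experiment described here is easy to repeat with exact arithmetic in place of an HP67. The sketch below is an illustration, not Apéry's or Cohen's own computation: it applies the five transformations to one row of each array (fixed n, k = 0, ..., n), reads off the diagonal entries, and tests them against the recurrence (2). The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def c(n, k):
    """c_{n,k} as defined in (6), for 0 <= k <= n."""
    head = sum(Fraction(1, m**3) for m in range(1, n + 1))
    tail = sum(Fraction((-1) ** (m - 1), 2 * m**3 * comb(n, m) * comb(n + m, m))
               for m in range(1, k + 1))
    return head + tail


def accelerate(row):
    """Apply the five transformations to one row d_{n,0..n} of an array."""
    n = len(row) - 1
    row = row[::-1]                                      # k -> n - k
    row = [comb(n, k) * row[k] for k in range(n + 1)]    # multiply by C(n, k)
    row = [sum(comb(k, m) * row[m] for m in range(k + 1)) for k in range(n + 1)]
    row = [comb(n, k) * row[k] for k in range(n + 1)]    # multiply by C(n, k)
    row = [sum(comb(k, m) * row[m] for m in range(k + 1)) for k in range(n + 1)]
    return row


def diagonal_entries(n):
    """Return (a_n, b_n): the k = n entries of the two transformed arrays."""
    first = [c(n, k) * comb(n + k, k) for k in range(n + 1)]
    second = [comb(n + k, k) for k in range(n + 1)]
    return accelerate(first)[n], accelerate(second)[n]


def satisfies_recurrence(u):
    """Check n^3 u_n + (n-1)^3 u_{n-2} = (34n^3 - 51n^2 + 27n - 5) u_{n-1}."""
    return all(n**3 * u[n] + (n - 1) ** 3 * u[n - 2]
               == (34 * n**3 - 51 * n**2 + 27 * n - 5) * u[n - 1]
               for n in range(2, len(u)))


pairs = [diagonal_entries(n) for n in range(8)]
a = [p[0] for p in pairs]
b = [p[1] for p in pairs]
print(b[:5], satisfies_recurrence(a), satisfies_recurrence(b))
```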
5. It Seems that Apery Has Shown that ζ(3) Is Irrational
We were quite unable to prove that the sequences {a_{n}} defined above did satisfy the recurrence (2) (Apéry rather tartly pointed out to me in Helsinki that he regarded this more a compliment than a criticism of his method). But empirically (numerically) the evidence in favour was utterly compelling. It seemed indeed that ζ(3) had been proved irrational, because the rest, thus 4, follows quite easily: Given (with p(n – 1) = 34n^{3} – 51n^{2} + 27n – 5),
n^{3}a_{n} – p(n – 1)a_{n–1} + (n – 1)^{3}a_{n–2} = 0,
n^{3}b_{n} – p(n – 1)b_{n–1} + (n – 1)^{3}b_{n–2} = 0,
one multiplies the first equation by b_{n–1}, the second by a_{n–1}, to obtain
n^{3}(a_{n}b_{n–1} – a_{n–1}b_{n}) = (n – 1)^{3}(a_{n–1}b_{n–2} – a_{n–2}b_{n–1}).
Recalling a_{1}b_{0} – a_{0}b_{1} = 6·1 – 0·5 = 6 this cleverly yields
a_{n}b_{n–1} – a_{n–1}b_{n} = 6/n^{3}. 
(7) 
Seeing that ζ(3) – a_{0}/b_{0} = ζ(3), it is easily deduced (write ζ(3) – a_{n}/b_{n} = x_{n}, and note that we have x_{n} – x_{n–1} = –6/n^{3}b_{n}b_{n–1} and x_{∞} = 0) that
$$\zeta(3) - \frac{a_n}{b_n} = \sum_{k=n+1}^{\infty} \frac{6}{k^3 b_k b_{k-1}}, \qquad \zeta(3) - \frac{a_n}{b_n} = O\!\left( \frac{1}{b_n^{2}} \right).$$
On the other hand the recurrence relation makes it easy to estimate b_{n}, at any rate asymptotically. We have
$$b_n - \left\{ 34 - \frac{51}{n} + \frac{27}{n^2} - \frac{5}{n^3} \right\} b_{n-1} + \left\{ 1 - \frac{3}{n} + \frac{3}{n^2} - \frac{1}{n^3} \right\} b_{n-2} = 0$$
and since the polynomial x^{2} – 34x + 1 has zeros 17 ± 12√2 = (√2 ± 1)^{4} we readily conclude that b_{n} = O(α^{4n}), α = 1 + √2. In fact Cohen has, more precisely, calculated that
$$b_n = \frac{(1 + \sqrt{2})^{2}}{(2\pi\sqrt{2})^{3/2}} \, \frac{(1 + \sqrt{2})^{4n}}{n^{3/2}} \left\{ 1 - \frac{48 - 15\sqrt{2}}{64 n} + O\!\left( \frac{1}{n^2} \right) \right\}. \tag{8}$$
We have to recall that the a_{n} are not integers. But writing p_{n} = 2[1, 2, ..., n]^{3}a_{n}, q_{n} = 2[1, 2, ..., n]^{3}b_{n} we have p_{n}, q_{n} ∈ Z and q_{n} = O(α^{4n}e^{3n}),
$$\zeta(3) - \frac{p_n}{q_n} = O\!\left( \frac{1}{\alpha^{8n}} \right) = O\!\left( \frac{1}{q_n^{1+\delta}} \right), \quad \text{with } \delta = \frac{4 \ln \alpha - 3}{4 \ln \alpha + 3} = 0.080529\ldots > 0.$$
Hence, by the irrationality criterion, ζ(3) is indeed irrational, and moreover, because 1 + 1/δ = 13.417820... we have: For all integers p, q > 0 sufficiently large relative to ε > 0
$$\left| \zeta(3) - \frac{p}{q} \right| > \frac{1}{q^{13.417820\ldots + \varepsilon}}.$$
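The arithmetic behind δ and the irrationality degree 1 + 1/δ of Section 2 can be spelled out in a few lines. This is a sketch under the stated model only: the numerators b_{n} grow like e^{gn}, the common denominator costs a factor e^{dn}, and the approximation error is of order e^{–2gn}; the function name and parameter names are hypothetical.

```python
from math import log, sqrt


def irrationality_degree(log_growth, log_denominator):
    """Return 1 + 1/delta with delta = (g - d) / (g + d).

    g = log_growth:       b_n grows like exp(g n)
    d = log_denominator:  the clearing factor grows like exp(d n)
    The criterion needs g > d, so that delta > 0.
    """
    if log_growth <= log_denominator:
        raise ValueError("no irrationality proof: delta would not be positive")
    delta = (log_growth - log_denominator) / (log_growth + log_denominator)
    return 1 + 1 / delta


alpha = 1 + sqrt(2)
omega = (1 + sqrt(5)) / 2
print(irrationality_degree(4 * log(alpha), 3))   # zeta(3): b_n ~ alpha^(4n), [1..n]^3
print(irrationality_degree(5 * log(omega), 2))   # zeta(2): B_n ~ omega^(5n), [1..n]^2
```

The same function makes the margin visible: 4 ln α ≈ 3.525 only just exceeds 3, which is why δ is as small as 0.0805.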
6. Some Trivial Verifications
To convince ourselves of the validity of Apery's proof we need only complete the following exercise.
EXERCISE
Prove the following identities:
5 
Let c_{n,k} be defined by (6) and
$$a_n = \sum_{k=0}^{n} \binom{n}{k}^{2} \binom{n+k}{k}^{2} c_{n,k}, \qquad b_n = \sum_{k=0}^{n} \binom{n}{k}^{2} \binom{n+k}{k}^{2}.$$
Then a_{0} = 0, a_{1} = 6; b_{0} = 1, b_{1} = 5 and each sequence {a_{n}} and {b_{n}} satisfies the recurrence (2).

In the same spirit, the case of ζ(2) requires:
5' 
Let
$$C_{n,k} = 2 \sum_{m=1}^{n} \frac{(-1)^{m-1}}{m^2} + \sum_{m=1}^{k} \frac{(-1)^{n+m-1}}{m^2 \binom{n}{m} \binom{n+m}{m}},$$
$$A_n = \sum_{k=0}^{n} \binom{n}{k}^{2} \binom{n+k}{k} C_{n,k}, \qquad B_n = \sum_{k=0}^{n} \binom{n}{k}^{2} \binom{n+k}{k}.$$
Then A_{0} = 0, A_{1} = 5; B_{0} = 1, B_{1} = 3 and each sequence {A_{n}} and {B_{n}} satisfies the recurrence (4).

It is useful to notice that very little more than just proving these claims is required for Apéry's proof. After all, it is quite plain that a_{n}/b_{n} → ζ(3); the b_{n} are integers, and the lemma of Section 4 shows that the a_{n} are "near-integers". In Section 5 we showed that if the sequences satisfy the recursion (2) then the irrationality of ζ(3) follows, because from 4 ln α > 3 we obtain δ > 0. Thus, as implied in various asides, most of the earlier argument is quite irrelevant. Indeed I am indebted to John Conway for the remark that even 5 is irrelevant.
EXERCISE
Be the first in your block to prove by a 2line argument that ζ(3) is irrational (The author does not pretend to be able to do this. Notice that in fact even less is needed: it is sufficient to show a_{n}b_{n–1} – a_{n–1}b_{n} = O(γ^{n}) and b_{n} = O(β^{n}), with ln β – ln γ > 3).
6 
Given the definitions of 5 show that a_{n}b_{n–1} – a_{n–1}b_{n} = 6/n^{3} and b_{n} = O(α^{4n}) with α = 1 + √2. Conclude that ζ(3) is irrational because ln α > ¾. 
EXERCISE
Astound your friends with an excellent irrationality measure for π^{2}.
6' 
Given the definitions of 5' show that A_{n}B_{n–1} – A_{n–1}B_{n} = 5(–1)^{n–1}/n^{2} and B_{n} = O(ω^{5n}) with ω = ½(1 + √5). Conclude that for all integers p, q > 0 sufficiently large relative to ε > 0
$$\left| \pi^2 - \frac{p}{q} \right| > \frac{1}{q^{11.850782\ldots + \varepsilon}}.$$
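Exercises 6 and 6' can at least be tested before they are proved. The sketch below (hypothetical function names) builds the sequences directly from the sums in 5 and 5' and evaluates the two determinants exactly, with capital letters for the ζ(2) sequences as in 5'.

```python
from fractions import Fraction
from math import comb


def c3(n, k):
    """c_{n,k} of (6), for 0 <= k <= n."""
    head = sum(Fraction(1, m**3) for m in range(1, n + 1))
    tail = sum(Fraction((-1) ** (m - 1), 2 * m**3 * comb(n, m) * comb(n + m, m))
               for m in range(1, k + 1))
    return head + tail


def c2(n, k):
    """C_{n,k} of exercise 5', for 0 <= k <= n."""
    head = 2 * sum(Fraction((-1) ** (m - 1), m**2) for m in range(1, n + 1))
    tail = sum(Fraction((-1) ** (n + m - 1), m**2 * comb(n, m) * comb(n + m, m))
               for m in range(1, k + 1))
    return head + tail


def zeta3_pair(n):
    """(a_n, b_n) from the sums in exercise 5."""
    w = [comb(n, k) ** 2 * comb(n + k, k) ** 2 for k in range(n + 1)]
    return sum(w[k] * c3(n, k) for k in range(n + 1)), sum(w)


def zeta2_pair(n):
    """(A_n, B_n) from the sums in exercise 5'."""
    w = [comb(n, k) ** 2 * comb(n + k, k) for k in range(n + 1)]
    return sum(w[k] * c2(n, k) for k in range(n + 1)), sum(w)


def casoratian(pair, n):
    """u_n v_{n-1} - u_{n-1} v_n for the pair of sequences (u, v)."""
    u1, v1 = pair(n)
    u0, v0 = pair(n - 1)
    return u1 * v0 - u0 * v1


print([casoratian(zeta3_pair, n) for n in range(1, 5)])
print([casoratian(zeta2_pair, n) for n in range(1, 5)])
```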

Though we have long known that ζ(2) is irrational, Apéry's result in this case is significant. The irrationality degree for π^{2} is the best known; the irrationality degree implied for π is 23.701564... . These results compare very favourably with those of Mahler [8]: |π – p/q| > q^{–42}.
Wirsing announced |π – p/q| > q^{–21} and Mignotte proved that (for q sufficiently large) |π – p/q| > q^{–20}; this is the best known result. It should be noted that the cited results depend on deep techniques and complicated estimates in transcendence theory as contrasted with the essentially elementary methods in Apéry's proof. Mignotte also shows that |π^{2} – p/q| > q^{–18}, which is weaker than Apéry's result.
7. ICM'78. Helsinki, August 1978
Neither Cohen nor I had been able to prove 5 or 5' in the intervening 2 months. After a few days of fruitless effort the specific problem was mentioned to Don Zagier (Bonn), and with irritating speed he showed that indeed the sequence {B_{n}} satisfies the recurrence (4). This more or less broke the dam and 5 and 5' were quickly conquered. Henri Cohen addressed a very well-attended meeting at 17:00 on Friday, August 18 in the language of the majority, proving 5 and explaining how this implied the irrationality of ζ(3). Apéry then made some remarks on the status of the French language, and alluded to the underlying motivation (as mentioned in Section 3) for his astonishing proof.
EXERCISE
Show that
7 
$$\zeta(3) = \cfrac{6}{5 - \cfrac{1}{117 - \cfrac{64}{535 - \cdots - \cfrac{n^6}{34n^3 + 51n^2 + 27n + 5 - \cdots}}}}$$
and deduce that ζ(3) = 1.202056903... is irrational.

7' 
$$\zeta(2) = \frac{\pi^2}{6} = \cfrac{5}{3 + \cfrac{1}{25 + \cfrac{16}{69 + \cdots + \cfrac{n^4}{11n^2 + 11n + 3 + \cdots}}}}$$
and deduce that π^{2} has irrationality degree at most 11.850782... .
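The two continued fractions can be evaluated from the bottom up. The sketch below is illustrative, with hypothetical function names; it truncates each fraction at a given depth, works in exact rationals, and converts to floating point only at the end.

```python
from fractions import Fraction
from math import pi


def zeta3_convergent(depth):
    """6 / (5 - 1/(117 - 64/(535 - ...))), truncated after `depth` partial quotients."""
    def P(n):
        return 34 * n**3 + 51 * n**2 + 27 * n + 5

    tail = Fraction(P(depth))
    for n in range(depth, 0, -1):
        tail = P(n - 1) - Fraction(n**6) / tail
    return 6 / tail


def zeta2_convergent(depth):
    """5 / (3 + 1/(25 + 16/(69 + ...))), truncated after `depth` partial quotients."""
    def Q(n):
        return 11 * n**2 + 11 * n + 3

    tail = Fraction(Q(depth))
    for n in range(depth, 0, -1):
        tail = Q(n - 1) + Fraction(n**4) / tail
    return 5 / tail


for depth in range(4):
    print(depth, zeta3_convergent(depth), zeta2_convergent(depth))
print(float(zeta3_convergent(12)), float(zeta2_convergent(12)), pi**2 / 6)
```

The convergents at depth 0 and 1 are 6/5, 351/292 and 5/3, 125/76, which are the quotients a_{n}/b_{n} and A_{n}/B_{n} for n = 1, 2.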

8. Some Rather Complicated but Ingenious Explanations
According to a dictum of Littlewood any identity, once verified, is trivial. Surely 5 is very nearly a counterexample. The following is principally due to Zagier and Cohen. Incidentally, we first considered 5' which appeared simpler, but this was because we had failed to notice that
$$\sum_{k=0}^{n} \sum_{l=0}^{k} \binom{n}{k}^{2} \binom{n}{l} \binom{k}{l} \binom{2n-l}{n} = \sum_{k=0}^{n} \binom{n}{k}^{2} \binom{2n-k}{n}^{2}.$$
Now writing n – k for k links the arrays of Section 4 to 5. It is quite convenient to write:
$$b_{n,k} = \binom{n}{k}^{2} \binom{n+k}{k}^{2}, \quad a_{n,k} = b_{n,k} c_{n,k} \qquad \Bigl( b_n = \sum_{k=0}^{n} b_{n,k}, \quad a_n = \sum_{k=0}^{n} b_{n,k} c_{n,k} \Bigr).$$
Then we wish to show that
$$\sum_{k} \Bigl( (n+1)^3 b_{n+1,k} - (34n^3 + 51n^2 + 27n + 5) b_{n,k} + n^3 b_{n-1,k} \Bigr) = 0.$$
We cleverly construct
$$B_{n,k} = 4(2n+1) \bigl( k(2k+1) - (2n+1)^2 \bigr) \binom{n}{k}^{2} \binom{n+k}{k}^{2}$$
with the motive that
$$B_{n,k} - B_{n,k-1} = (n+1)^3 \binom{n+1}{k}^{2} \binom{n+1+k}{k}^{2} - (34n^3 + 51n^2 + 27n + 5) \binom{n}{k}^{2} \binom{n+k}{k}^{2} + n^3 \binom{n-1}{k}^{2} \binom{n-1+k}{k}^{2},$$
and, O mirabile dictu, the sequence {b_{n}} does indeed satisfy the recurrence (2) by virtue of the method of creative telescoping (by the usual conventions: B_{n, k} = 0 for k < 0 or k > n; note also that P(n) = 34n^{3} + 51n^{2} + 27n + 5 implies P(n–1) = –P(–n)). The rest is plain sailing (or is it plane sailing?) We notice that
$$(n+1)^3 b_{n+1,k} c_{n+1,k} - P(n) b_{n,k} c_{n,k} + n^3 b_{n-1,k} c_{n-1,k} = (B_{n,k} - B_{n,k-1}) c_{n,k} + (n+1)^3 b_{n+1,k} (c_{n+1,k} - c_{n,k}) - n^3 b_{n-1,k} (c_{n,k} - c_{n-1,k}). \tag{9}$$
Clearly
$$c_{n,k} - c_{n-1,k} = \frac{1}{n^3} + \sum_{m=1}^{k} \frac{(-1)^{m} (m-1)!^{2} (n-m-1)!}{(n+m)!}$$
$$= \frac{1}{n^3} + \sum_{m=1}^{k} \left( \frac{(-1)^{m} m!^{2} (n-m-1)!}{n^2 (n+m)!} - \frac{(-1)^{m-1} (m-1)!^{2} (n-m)!}{n^2 (n+m-1)!} \right) = \frac{(-1)^{k} k!^{2} (n-k-1)!}{n^2 (n+k)!}$$
whilst not even a minor miracle is required to write down c_{n,k} – c_{n,k–1}. After some massive reorganization (9) becomes A_{n,k} – A_{n,k–1} with
$$A_{n,k} = B_{n,k} c_{n,k} + \frac{5(2n+1)(-1)^{k-1} k}{n(n+1)} \binom{n}{k} \binom{n+k}{k}$$
and we have completed 5 , and, in passing, proved 3 . This of course verifies Apery's claim to have proved ζ(3) irrational.
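The two telescoping certificates of this section can be verified mechanically for small n. The sketch below is an illustrative check rather than a proof, with hypothetical function names. It uses the stated conventions (binomial coefficients vanish outside 0 ≤ k ≤ n, and a product b_{n,k}c_{n,k} counts as zero wherever b_{n,k} = 0), and it takes the difference c_{n,k} – c_{n–1,k} in the form (–1)^{k}k!^{2}(n – k – 1)!/(n^{2}(n + k)!).

```python
from fractions import Fraction
from math import comb, factorial


def binom(n, k):
    """Binomial coefficient, zero outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0


def b(n, k):
    return binom(n, k) ** 2 * binom(n + k, k) ** 2


def c(n, k):
    """c_{n,k} of (6); only meaningful for 0 <= k <= n."""
    head = sum(Fraction(1, m**3) for m in range(1, n + 1))
    tail = sum(Fraction((-1) ** (m - 1), 2 * m**3 * comb(n, m) * comb(n + m, m))
               for m in range(1, k + 1))
    return head + tail


def bc(n, k):
    """b_{n,k} c_{n,k}, taken as zero wherever b_{n,k} vanishes."""
    return b(n, k) * c(n, k) if b(n, k) else Fraction(0)


def P(n):
    return 34 * n**3 + 51 * n**2 + 27 * n + 5


def B(n, k):
    return 4 * (2 * n + 1) * (k * (2 * k + 1) - (2 * n + 1) ** 2) * b(n, k)


def A(n, k):
    if not 0 <= k <= n:
        return Fraction(0)
    # (-1)^(k+1) has the same sign as (-1)^(k-1) and avoids a negative exponent.
    extra = Fraction(5 * (2 * n + 1) * (-1) ** (k + 1) * k, n * (n + 1))
    return B(n, k) * c(n, k) + extra * binom(n, k) * binom(n + k, k)


def b_certificate_ok(n, k):
    rhs = (n + 1) ** 3 * b(n + 1, k) - P(n) * b(n, k) + n**3 * b(n - 1, k)
    return B(n, k) - B(n, k - 1) == rhs


def a_certificate_ok(n, k):
    rhs = (n + 1) ** 3 * bc(n + 1, k) - P(n) * bc(n, k) + n**3 * bc(n - 1, k)
    return A(n, k) - A(n, k - 1) == rhs


def c_difference_ok(n, k):
    """c_{n,k} - c_{n-1,k} against its closed form, for 0 <= k <= n - 1."""
    closed = Fraction((-1) ** k * factorial(k) ** 2 * factorial(n - k - 1),
                      n**2 * factorial(n + k))
    return c(n, k) - c(n - 1, k) == closed


print(all(b_certificate_ok(n, k) for n in range(1, 7) for k in range(n + 2)))
print(all(a_certificate_ok(n, k) for n in range(1, 7) for k in range(n + 2)))
```

Summing either certificate over k = 0, ..., n + 1 collapses the left side to zero, which is the recurrence (2) for {b_{n}} and {a_{n}} respectively.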
9. The Case of ζ(2)
The arguments required to deal with the exercises 2' – 6' are quite similar to those already described. It may however be a kindness to the reader to reveal that it would be wise to take
$$B_{n,k} = \bigl( k^2 + 3(2n+1)k - 11n^2 - 9n - 2 \bigr) \binom{n}{k}^{2} \binom{n+k}{k},$$
$$A_{n,k} = B_{n,k} C_{n,k} + 3(-1)^{n+k-1} \frac{(n-1)!}{(k-1)! \, (n-k)!}.$$
Moreover
$$C_{n,k} - C_{n-1,k} = 2(-1)^{n+k-1} \frac{k!^{2} (n-k-1)!}{n (n+k)!}$$
and
$$B_n = \frac{\bigl[ \tfrac{1}{2}(1 + \sqrt{5}) \bigr]^{5n+4}}{2\pi n \sqrt{5 + 2\sqrt{5}}} \left( 1 + O\!\left( \frac{1}{n} \right) \right) \tag{10}$$
(also note that if Q(n) = 11n^{2} + 11n + 3 then Q(n–1) = Q(–n)).
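The corresponding certificates for ζ(2) can be checked in the same way. In the sketch below (hypothetical function names) the correction term in A_{n,k} is taken to be 3(–1)^{n+k–1}·binom(n–1, k–1), that is 3(–1)^{n+k–1}(n – 1)!/((k – 1)!(n – k)!); this reading is an assumption of the sketch, chosen because it vanishes at k = 0 and k = n + 1 and makes the telescoping exact in the cases tested. The recurrence is used in the form (n + 1)^{2}u_{n+1} = Q(n)u_{n} + n^{2}u_{n–1}.

```python
from fractions import Fraction
from math import comb


def binom(n, k):
    """Binomial coefficient, zero outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0


def w(n, k):
    """Summand of B_n: C(n, k)^2 C(n+k, k)."""
    return binom(n, k) ** 2 * binom(n + k, k)


def C(n, k):
    """C_{n,k} of exercise 5'; only meaningful for 0 <= k <= n."""
    head = 2 * sum(Fraction((-1) ** (m - 1), m**2) for m in range(1, n + 1))
    tail = sum(Fraction((-1) ** (n + m - 1), m**2 * comb(n, m) * comb(n + m, m))
               for m in range(1, k + 1))
    return head + tail


def wC(n, k):
    """w(n, k) C_{n,k}, taken as zero wherever w(n, k) vanishes."""
    return w(n, k) * C(n, k) if w(n, k) else Fraction(0)


def Q(n):
    return 11 * n**2 + 11 * n + 3


def B(n, k):
    return (k**2 + 3 * (2 * n + 1) * k - 11 * n**2 - 9 * n - 2) * w(n, k)


def A(n, k):
    if not 0 <= k <= n:
        return Fraction(0)
    # Assumed correction term: 3 (-1)^(n+k-1) C(n-1, k-1).
    return B(n, k) * C(n, k) + 3 * (-1) ** (n + k - 1) * binom(n - 1, k - 1)


def b_certificate_ok(n, k):
    rhs = (n + 1) ** 2 * w(n + 1, k) - Q(n) * w(n, k) - n**2 * w(n - 1, k)
    return B(n, k) - B(n, k - 1) == rhs


def a_certificate_ok(n, k):
    rhs = (n + 1) ** 2 * wC(n + 1, k) - Q(n) * wC(n, k) - n**2 * wC(n - 1, k)
    return A(n, k) - A(n, k - 1) == rhs


print(all(b_certificate_ok(n, k) for n in range(1, 7) for k in range(n + 2)))
print(all(a_certificate_ok(n, k) for n in range(1, 7) for k in range(n + 2)))
```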
10. What on Earth is Going on Here?
Apéry's incredible proof appears to be a mixture of miracles and mysteries. The dominating question is how to generalize all this, down to the Euler constant γ and up to the general ζ(t)? Here we have, apparently, the tip of an iceberg which relates 1 + √2 to ζ(3) and ½(1 + √5) to ζ(2); we have surprising identities 2 and 2', and startling continued fractions (produced by Cohen for his Helsinki talk), 7 and 7'. Does the complete berg look like this? For my part I incline to the view that much of what has been presented constitutes a mystification rather than an explanation. For example Richard Askey (Madison, Wisconsin) has pointed out to me that the sequences {b_{n}} and {B_{n}} may be recognized as special values of certain hypergeometric polynomials; immediately the recurrences 3 and 3' become identities relating hypergeometric functions and much of the magic fades away. Unfortunately the difficulties remain, because not all that much is known about the higher generalizations of the classical hypergeometric functions. For this, and other reasons, it is however likely that one should think about recurrences of order greater than 2. This, incidentally, means that the continued fractions constitute a red herring. In any event 7 obscures a fundamental miracle. Its convergents P_{n}/Q_{n} are of course such that the sequences {P_{n}} and {Q_{n}} both satisfy U_{n+1} = (34n^{3} + 51n^{2} + 27n + 5)U_{n} – n^{6}U_{n–1}. The proof works (not because the continued fraction does not terminate; that only works for regular continued fractions, but) because if U_{0} = 1, U_{1} = 5 then it happens that n!^{3} divides the integers U_{n}; more honestly: it is already enough (and is necessary) that for any initial integer values U_{0}, U_{1}, n!^{3} always divides 2[1, 2, ..., n]^{3}U_{n}. An analogous miracle makes the recurrence U_{n+1} = (11n^{2} + 11n + 3)U_{n} + n^{4}U_{n–1} useful in proving the irrationality of ζ(2). 
Tom Cusick (Buffalo) has noticed that the following recurrences also yield continued fractions converging to π^{2}/6:
n^{2}u_{n} = (7n^{2} – 7n + 2)u_{n–1} + 8(n–1)^{2}u_{n–2}
(one solution of which is $u_n = \sum_{k=0}^{n} \binom{n}{k}^{3}$),
and n^{3}u_{n} = 2(2n – 1)(3n^{2} – 3n + 1)u_{n–1} + (4n – 3)(4n – 4)(4n – 5)u_{n–2}
On first impression the first yields a worse irrationality degree for π^{2} than that obtained by Apery, and the second does not yield irrationality at all. Apery's results are indeed remarkable. These surprises generalize the following quite well known fact (to which I was alerted by Frits Beukers (Leiden)): the recurrence U_{n+1} = (6n + 3)U_{n} – n^{2}U_{n–1} is such that n! divides U_{n} if U_{0} = 1, U_{1} = 3; and n! divides [1, 2, ..., n]U_{n} for all integer initial values U_{0}, U_{1}.
EXERCISE
What are the higher analogues?
8 
Show that if
$$B(z) = \frac{1}{\sqrt{1 - 6z + z^2}} = \sum_{n=0}^{\infty} b_n z^n, \quad \text{then} \quad b_n = \sum_{k=0}^{n} \binom{n}{k} \binom{n+k}{k}.$$
Find an expression for the a_{n} in
$$A(z) = \frac{1}{\sqrt{1 - 6z + z^2}} \int_{0}^{z} \frac{dt}{\sqrt{1 - 6t + t^2}} = \sum_{n=0}^{\infty} a_n z^n$$
and notice that the [1, 2, ..., n]a_{n} all are integers. Show that the sequences {a_{n}} (a_{0} = 0, a_{1} = 1) and {b_{n}} (b_{0} = 1, b_{1} = 3) both satisfy nu_{n} + (n – 1)u_{n–2} = (6n – 3)u_{n–1}. Now prove that there is a constant λ such that
$$A(z) - \lambda B(z) = \sum_{n=0}^{\infty} c_n z^n$$
has no singularity at (√2 – 1)^{2}. Deduce that then c_{n} = O((√2 – 1)^{2n}) and conclude that it follows that ln 2 has irrationality degree at most 4.622100832... .
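A numerical preview of exercise 8 is sketched below, with hypothetical function names. It is not a solution: it iterates the recurrence nu_{n} + (n – 1)u_{n–2} = (6n – 3)u_{n–1} exactly, compares {b_{n}} with the binomial sums, tests that [1, 2, ..., n]a_{n} is an integer, and compares a_{n}/b_{n} with ½ ln 2, which is the value of the integral of dt/√(1 – 6t + t²) from 0 to (√2 – 1)^{2} and hence the natural candidate for λ.

```python
from fractions import Fraction
from math import comb, lcm, log


def solve(u0, u1, terms):
    """Iterate n u_n = (6n - 3) u_{n-1} - (n - 1) u_{n-2}."""
    u = [Fraction(u0), Fraction(u1)]
    for n in range(2, terms):
        u.append(((6 * n - 3) * u[-1] - (n - 1) * u[-2]) / n)
    return u


def binomial_sum(n):
    """sum_{k=0}^{n} C(n, k) C(n+k, k)."""
    return sum(comb(n, k) * comb(n + k, k) for k in range(n + 1))


def lcm_upto(n):
    """[1, 2, ..., n]; the empty LCM is 1."""
    return lcm(*range(1, n + 1))


a = solve(0, 1, 25)
b = solve(1, 3, 25)
print([int(x) for x in b[:6]])
print([str(x) for x in a[:5]])
print(float(a[20] / b[20]), log(2) / 2)
```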

Of course, 6 should remind us that recurrences may be quite irrelevant to the proof. The vital thing then is a suitable definition of the c_{n,k}, so one is brought back to looking for generalizations of 2. But, for the present, generalization of Apéry's work remains, as they say, a mystery wrapped in an enigma. Well, not really. It is just that it is not at all clear where to go. A numerical test (suggested by Cohen) implies that
$$\zeta(4) = \frac{\pi^4}{90} = \frac{36}{17} \sum_{n=1}^{\infty} \frac{1}{n^4 \binom{2n}{n}}$$
(so this is true for all practical purposes) and it has been shown by Gosper that
$$\zeta(5) = \frac{5}{2} \sum_{n=1}^{\infty} \left( 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{(n-1)^2} - \frac{4}{5n^2} \right) \frac{(-1)^{n}}{n^3 \binom{2n}{n}}$$
(the sum of inverse squares inside the bracket being empty for n = 1).
David Hawkins (Boulder) suggests similar formulas. Apparently such expressions can be generated virtually at will on using appropriate series accelerator identities. Most startling of all though should be the fact that Apery's proof has no aspect that would not have been accessible to a mathematician of 200 years ago. The proof we have seen is one that many mathematicians could have found, but missed.
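A "numerical test" of the kind mentioned for ζ(4) takes only a few lines today. The sketch below is illustrative, with hypothetical function names; it assumes the summand 1/(n^{4}·binom(2n, n)) for ζ(4), and for ζ(5) the weight (–1)^{n}/(n^{3}·binom(2n, n)) with the sum starting at n = 1, both of which are assumptions of the sketch to be judged by the numerical agreement.

```python
from math import comb, pi


def zeta4_series(terms):
    """(36/17) * sum_{n>=1} 1 / (n^4 C(2n, n)), truncated."""
    return 36 / 17 * sum(1 / (n**4 * comb(2 * n, n)) for n in range(1, terms + 1))


def zeta5_series(terms):
    """(5/2) * sum_{n>=1} (H2(n-1) - 4/(5 n^2)) (-1)^n / (n^3 C(2n, n)), truncated.

    H2(n-1) = 1 + 1/2^2 + ... + 1/(n-1)^2, an empty sum for n = 1.
    """
    total = 0.0
    inverse_squares = 0.0            # running value of H2(n-1)
    for n in range(1, terms + 1):
        weight = (-1) ** n / (n**3 * comb(2 * n, n))
        total += (inverse_squares - 4 / (5 * n * n)) * weight
        inverse_squares += 1 / (n * n)
    return 2.5 * total


def zeta_direct(s, terms):
    """Plain truncated Dirichlet series, for comparison."""
    return sum(1 / n**s for n in range(1, terms + 1))


print(zeta4_series(40), pi**4 / 90)
print(zeta5_series(40), zeta_direct(5, 5000))
```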
This note was written at Queen's University, Kingston, Ontario whilst the author was on study leave from the University of New South Wales, Sydney, Australia (October, 1978).
P.S. See [9] for many delightful facts including the trilogarithm formula of Section 3 which is given at p.139. At p.89 of [10] one is astonished to be asked to prove as an exercise that
Seeing that
$$\sum_{n=1}^{\infty} \frac{x^{2n}}{n^2 \binom{2n}{n}} = 2 \arcsin^2 \left( \frac{x}{2} \right)$$
([6], p.108) the first three formulae (and the one with trilogarithm) become quite accessible to proof, but I had not detected anyone able to prove the expression for ζ(4), until I proved it in March 1979 after noticing a remark of Lewin that also
$$2 \int_{0}^{\pi/3} x \ln^2 \left( 2 \sin \frac{x}{2} \right) dx = \frac{17\pi^4}{3240}.$$
Sam Wagstaff (Illinois) and Andrew Odlyzko (Bell Labs) have mentioned to me that numerical evidence suggests that there are formulae of the shape 2 and 2' for ζ(t) only for t = 2, 3, 4, and this is verified by my studies in a current manuscript Some wonderful formulae. The recurrences (9) are long known, see [10], p.90. One can recognise the b_{n} as b_{n} = _{4}F_{3}(n+1, –n, n+1, –n; 1,1,1; 1) and determine the recurrence 3 by way of three term relations with contiguous balanced series; see [11].
Frits Beukers (Leiden) [12] has found an elegant approach to Apéry's proof which entirely avoids explicit identities, recurrences and other magic. Instead just consider
$$I = -\frac{1}{2} \int_{0}^{1} \int_{0}^{1} \frac{P_n(x) P_n(y) \ln xy}{1 - xy} \, dx \, dy = b_n \zeta(3) - a_n$$
noticing that the b_{n} are integers and the a_{n} are rationals with the 2[1, 2, ..., n]^{3}a_{n} integers, whilst |I| ≤ ζ(3)(√2 – 1)^{4n}. Here P_{n}(z) = (d/dz)^{n}[z^{n}(1 – z)^{n}]/n! is the Legendre polynomial. Again, there is no obvious way to generalize the proof.
In retrospect it seems clear that 8 really is useful; implications are being considered by Bombieri et al (at Princeton). For example, one's intuition is just wrong in feeling incredulity at the facts of 3. All that this reports is that the differential equation
$$\frac{d}{dx} \left\{ (x^4 - 34x^3 + x^2) \frac{d^3 y}{dx^3} + (6x^3 - 153x^2 + 3x) \frac{d^2 y}{dx^2} + (7x^2 - 112x + 1) \frac{dy}{dx} + (x - 5) y - (u_1 - 5u_0) \right\} = 0$$
has two G-function solutions, namely a(x) = a_{1}x + a_{2}x^{2} + ...; b(x) = 1 + b_{1}x + b_{2}x^{2} + ...; and a(x) – ζ(3)b(x) is regular (in fact vanishes) at α = (√2 – 1)^{4}. This is interesting, but no longer incredible; and it is readily generalizable... All this too is an idea of Beukers. Some officious readers have been critical of my casual use of the O-symbol; the fault is mine, not Apéry's. No harm is done. Similarly it has been claimed that Apéry's proof was not missed by Euler – «Euler did not know the prime number theorem»; to me it seems hypercritical to suggest that [1, 2, ..., n] = O((√2 + 1)^{4n/3}) could not have been noticed at the time, had it been needed. Anyhow, I considered it a racy title. It arose after Cohen's report at Helsinki, with someone sourly commenting «A victory for the French peasant...»; to this Nick Katz retorted: «No...! No! This is marvellous! It is something Euler could have done...».
School of Mathematics and Physics, Macquarie University North Ryde, New South Wales, Australia 2113 

March 1979 
References
[1] I. Niven. Irrational Numbers (Carus Monograph #11). MAA/Wiley, 1967.
[2] Bruce C. Berndt. Modular transformations and generalizations of several formulae of Ramanujan. Rocky Mountain J. of Maths 7 (1977), pp. 147–189.
[3] Bruce C. Berndt. Elementary evaluation of ζ(2n). Maths Mag. 48 (1975), pp. 148–153.
[4] Margrethe Munthe Hjortnaes. Overføring av rekken ∑ 1/k³ til et bestemt integral. Proc. 12th Cong. Scand. Maths, Lund 10–15 Aug. 1953 (Lund 1954).
[5] R. William Gosper, Jr. A calculus of series rearrangements. In "Algorithms and Complexity, New Directions and Recent Results", ed. J. Traub. Academic Press, 1976, pp. 121–151.
[6] Z. R. Melzak. Introduction to Concrete Mathematics. Wiley, 1973, p. 85.
[7] J. Barkley Rosser and Lowell Schoenfeld. Math. Comp. 29 (1975), pp. 243–269.
[8] K. Mahler. Applications of some formulae by Hermite to the approximation of exponentials and logarithms. Math. Annalen 168 (1967), pp. 200–227.
[9] Leonard Lewin. Dilogarithms and associated functions. Macdonald, London, 1958.
[10] Louis Comtet. Advanced Combinatorics. D. Reidel, Dordrecht, 1974.
[11] J. A. Wilson. Hypergeometric series, recurrence relations and some new orthogonal functions. Ph.D. Thesis; U. Wisconsin–Madison, 1978.
[12] Frits Beukers. A note on the irrationality of ζ(2) and ζ(3). J. Lond. Math. Soc. (to appear).