Prime Number Theorem And ...

7 Explicit estimates

We will try to systematically collect explicit estimates related to the prime number theorem from the literature, and formalize them in a modular fashion. We divide such estimates into four classes:

  • Zeta function explicit estimates: bounds on the zeta function and its zeroes.

  • Primary explicit estimates: those that are directly control \(\psi (x)\) and \(M(x)\), usually via information on the zeta function.

  • Secondary explicit estimates: these are useful general-purpose estimates on functions relating to the primes, such as bounds on the \(n\)-th prime, or estimates for the prime counting function \(\pi (x)\). These are generally derived from primary estimates and elementary arguments.

  • Tertiary explicit estimates: these are bespoke applications to particular problems in analytic number theory or combinatorics that often require secondary estimates as input.

In this project we will state the best available zeta and primary estimates known in the literature, and try to formalize at least some of them; state the best available secondary estimates known in the literature, as well as various tools from passing from primary to secondary estimates, and formalize these; and then finally formalize some tertiary estimates as applications of the secondary ones.

7.1 Mertens’ theorems

In this section we give explicit versions of Mertens’ theorems:

  • Mertens’ first theorem (von Mangoldt form): \(\sum _{n \leq x} \frac{\Lambda (n)}{n} = \log x + O(1)\).

  • Mertens’ first theorem (prime form): \(\sum _{p \leq x} \frac{\log p}{p} = \log x + O(1)\).

  • Mertens’ second theorem (von Mangoldt form): \(\sum _{n \leq x} \frac{\Lambda (n)}{n \log n} = \log \log x + \gamma + O(1/\log x)\).

  • Mertens’ second theorem (prime form): \(\sum _{p \leq x} \frac{1}{p} = \log \log x + M + O(1/\log x)\), where \(M\) is the Meissel-Mertens constant.

  • Mertens’ third theorem: \(\prod _{p \leq x} (1 - \frac{1}{p}) = e^{-\gamma }/\log x + O(1/\log ^2 x)\).

We aim to upstreaming these results to Mathlib. In particular, the arguments here should be self-contained and written for efficiency, coherency, and clarity. As such, extensive use of AI tools is strongly discouraged in this section.

The arguments here are drawn from Leo Goldmakher’s “A quick proof of Mertens’ theorem” from https://web.williams.edu/Mathematics/lg5/mertens.pdf

The unfinished formalization of Mertens’ theorems by Arend Mellendijk in https://github.com/FLDutchmann/Analytic/blob/main/Analytic/Mertens.lean may also be relevant here.

Lemma 7.1.1 Partial sum of logarithm identity
# Discussion

For any \(x \geq 1\), one has

\[ \sum _{n \leq x} \log n = x \log x - \{ x \} \log x - x + 1 + \int _1^x \{ t \} \frac{dt}{t} \]

(NOTE: this identity is not actually needed in the proof of Mertens’ theorems, but may be worth recording nevertheless.)

Proof

We have

\begin{align*} \sum _{n \leq x} \log n & = \int _1^x \log t \, d\lfloor t \rfloor \\ & = \log x \cdot \lfloor x \rfloor - \int _1^x \frac{\lfloor t \rfloor }{t} dt \\ & = x \log x - \{ x \} \log x - \int _1^x \frac{t}{t} dt + \int _1^x \{ t \} \frac{dt}{t}. \end{align*}
Lemma 7.1.2 Partial sum of logarithm upper bound
# Discussion

For any \(x \geq 1\), one has

\[ \sum _{n \leq x} \log n \leq x \log x. \]
Proof

Trivial since \(\log n \leq \log x\).

Corollary 7.1.1 Partial sum of logarithm lower bound
# Discussion

For any \(x \geq 1\), one has

\[ \sum _{n \leq x} \log n \geq x \log x - 2 x. \]
Proof

We have

\begin{align*} \sum _{n \leq x} \log n & = \sum _{2 \leq n \leq \lfloor x \rfloor } \log n \\ & \geq \sum _{2 \leq n \leq \lfloor x \rfloor } \int _{n-1}^n \log t \, dt \\ & = \int _1^{\lfloor x \rfloor } \log t \, dt \\ & \geq \int _1^x \log t\ dt - \log x \\ & = x \log x - x - \log x \\ & \geq x \log x - 2 x. \end{align*}

Here we use the monotonicity of \(\log n\) (and its vanishing at \(n=1\)) and the crude bound \(\log x \leq x\). Note: the tools at Mathlib.Analysis.SumIntegralComparisons may be useful.

Proposition 7.1.1 Partial sum of logarithm as logarithm of factorial
# Discussion

For any \(x \geq 1\), one has

\[ \sum _{n \leq x} \log n = \log (\lfloor x \rfloor !). \]
Proof

Follows from the definition of the factorial function. Is not needed for the Mertens theorems, but is a natural fact to have.

Lemma 7.1.3 Partial sum of logarithm as sum of \(\Lambda (d)/d\)
# Discussion

For any \(x \geq 1\), one has

\[ \sum _{n \leq x} \log n = \sum _{d \leq x} \Lambda (d) \lfloor \frac{x}{d} \rfloor . \]
Proof

We have

\begin{align*} \sum _{n \leq x} \log n & = \sum _{n \leq x} \sum _{d \mid n} \Lambda (d) = \sum _{d \leq x} \Lambda (d) \sum _{n \leq x, d \mid n} 1 = \sum _{d \leq x} \Lambda (d) \left\lfloor \frac{x}{d} \right\rfloor . \end{align*}
Definition 7.1.1 The remainder term in Mertens first theorem (von Mangoldt form)

We define \(E_{1,\Lambda }(x) := \sum _{d \leq x} \frac{\Lambda (d)}{d} - \log x\).

Corollary 7.1.2 Partial sum of \(\Lambda (d)/d\) lower bound
# Discussion

For any \(x \geq 1\), one has

\[ E_{1,\Lambda }(x) \geq - 2. \]
Proof

Insert Lemma 7.1.3 into Lemma 7.1.1 and lower bound \(x/d\) by \(\lfloor x/d \rfloor \).

Corollary 7.1.3 Partial sum of \(\Lambda (d)/d\) upper bound
# Discussion

For any \(x \geq 1\), one has

\[ E_{1,\Lambda }(x) \leq \log 4 + 4. \]
Proof

Insert Lemma 7.1.3 into Lemma 7.1.2 and upper bound \(x/d\) by \(\lfloor x/d \rfloor + 1\), and use the Mathlib bound \(\psi (x) \leq (\log 4 + 4) x\).

Corollary 7.1.4 Mertens’ first theorem (von Mangoldt form)
# Discussion

For any \(x \geq 1\), one has

\[ \sum _{n \leq x} \frac{\Lambda (n)}{n} = \log x + O(1). \]
Proof

Immediate from previous two corollaries.

Definition 7.1.2 The remainder term in Mertens first theorem (prime form)
#

We define \(E_{1,p}(x) := \sum _{p \leq x} \frac{\log p}{p} - \log x\).

Corollary 7.1.5 Prime error for Mertens first theorem bounded by von Mangoldt error

For any \(x \geq 1\), one has

\[ E_{1,p}(x) \leq E_{1,\Lambda }(x). \]
Proof

Drop all terms in Lemma 7.1.3 arising from prime powers.

Corollary 7.1.6 Partial sum of \(\frac{\log p}{p}\) upper bound

For any \(x \geq 1\), one has

\[ E_{1,p}(x) \leq \log 4 + 4. \]
Proof

Drop all terms in Lemma ?? arising from prime powers.

Proposition 7.1.2 Upper bound on \(E_1\)
# Discussion

One has \(E_1 \leq \frac{5 \log 2 + 3}{4}\)

Proof

We can bound \(E_1 \leq \sum _{n=2}^\infty \frac{\log n}{n(n-1)} \leq \frac{\log 2}{2} + \frac{3}{2} \sum _{n=3}^\infty \frac{\log n}{n^2}\). Calculus shows that \(\log x / x^2\) is decreasing for \(x \geq 2 {\gt} e^{1/2}\), so we can bound \(\sum _{n=3}^\infty \frac{\log n}{n^2} \leq \int _2^\infty \frac{\log t}{t^2}\ dt = \frac{\log 2+1}{2}\).

Corollary 7.1.7 Partial sum of \(\frac{\log p}{p}\) lower bound

For any \(x \geq 1\), one has

\[ E_{1,\Lambda }(x) \leq E_{1,p}(x) + E_1 \]

and thus

\[ E_{1,p}(x) \geq -2 - E_1 \]

where

\[ E_1 := \sum _{p} \frac{\log p}{p(p-1)}. \]
Proof

Use the triangle inequality and the geometric series formula to estimate in Lemma ?? arising from prime powers.

For any \(x \geq 1\), one has

\[ \sum _{p \leq x} \frac{\log p}{p} = \log x + O(1). \]
Proof

Immediate from previous two corollaries.

Definition 7.1.3 Alternate Formula for Euler-Mascheroni constant
#

We set \(\gamma := \int _2^\infty \frac{E_{1,\Lambda }(t)}{t \log ^2 t} \, dt + 1 - \log \log 2\).

Definition 7.1.4 The remainder term in Mertens second theorem (von Mangoldt form)

We define \(E_{2,\Lambda }(x) := \sum _{d \leq x} \frac{\Lambda (d)}{d \log d} - \log \log x - \gamma \).

Sublemma 7.1.1 Integral identity involving inverse log weight

For any \(x \geq 2\) and any \(f : {\mathbb N} \mapsto {\mathbb R}\), one has

\[ \sum _{2 \leq n \leq x} \frac{f(n)}{\log n} = \frac{1}{\log x} \sum _{2 \leq n \leq x} f(n) + \int _2^x \frac{1}{t \log ^2 t} \sum _{2 \leq n \leq t} f(n) \, dt \]
Proof

Establish the identity

\[ \frac{1}{\log n} = \frac{1}{\log x} + \int _2^x \frac{1}{t \log ^2 t} 1_{t \geq n}\ dt \]

for \(2 \leq n \leq x\),multiply by \(f(n)\), then sum.

Corollary 7.1.8 Integral form for second error (von Mangoldt form)

For any \(x \geq 2\), one has

\[ E_{2,\Lambda }(x) = \frac{E_{1,\Lambda }(x)}{\log x} - \int _x^\infty \frac{E_{1,\Lambda }(t)}{t \log ^2 t}\ dt \]
Proof

From Lemma 7.1.1 one has

\[ \sum _{n \leq x} \frac{\Lambda (n)}{n \log n} = \frac{1}{\log x} \sum _{n \leq x} \frac{\Lambda (n)}{n} + \int _2^x \frac{1}{t \log ^2 t} \sum _{n \leq t} \frac{\Lambda (n)}{n} \, dt. \]

Now substitute the definitions of \(E_{1,\Lambda }\), \(E_{2,\Lambda }\), \(\gamma \) and simplify.

For any \(x \geq 2\), one has

\[ |E_{2,\Lambda }(x)| \leq \frac{\log 4 + 6}{\log x}. \]
Proof

Insert Lemma 7.1.3 and Lemma 7.1.2 into Lemma 7.1.8 and use the triangle inequality to obtain the required upper and lower bounds.

Theorem 7.1.2 An identity for \(\log \zeta (s)\)

If \(s {\gt} 1\) then \(\log \zeta (s) = - \log (s-1) + \Gamma '(1) + \gamma + (s-1) \int _1^\infty E_{2,\Lambda }(x) x^{-s}\ ds\).

Proof

First write

\[ \log \zeta (s) = \sum _n \frac{\Lambda (n)}{n^s \log n} \]

and integrate by parts to write this as

\[ (s-1) \int _0^\infty (\log \log x + \gamma + E_{2,\Lambda }(x)) x^{-s}\ dx. \]

Standard calculations give

\[ (s-1) \int _0^\infty \log \log x \cdot x^{-s}\ dx = -\log (s-1) + \Gamma '(1) \]

and

\[ (s-1) \int _0^\infty \gamma \cdot x^{-s}\ dx = \gamma \]

giving the claim.

Theorem 7.1.3 Compatibility with Mathlib Euler-Mascheroni constant

\(\gamma \) is the Euler–Mascheroni constant.

Proof

Take limits as \(s \to 1\) in the previous asymptotic using known asymptotics for \(\zeta (s)\), and using that \(- \Gamma '(1)\) is the Euler–Mascheroni constant.

Theorem 7.1.4 Mertens’ second theorem (weak von Mangoldt form)

For any \(x \geq 2\), one has

\[ \sum _{n \leq x} \frac{\Lambda (n)}{n \log n} = \log \log x + O(1). \]
Proof

Immediate from previous two corollaries.

Definition 7.1.5 The Meissel-Mertens constant
#

We define \(M := \int _2^\infty \frac{E_{1,p}(t)}{t \log ^2 t} \, dt + 1 - \log \log 2\).

Corollary 7.1.10 Upper bound for \(M\)

One has \(M \leq \frac{\log 4 + 4}{\log 2} + 1 - \log \log 2\).

Proof

Insert Lemma 7.1.6 into the definition of \(M\) and use the fact that \(\int _2^\infty \frac{dt}{t \log ^2 t} = 1/\log 2\).

Corollary 7.1.11 Lower bound for \(M\)

One has \(M \geq -\frac{2 + E_1}{\log 2} + 1 - \log \log 2\).

Proof

Insert Lemma 7.1.7 into the definition of \(M\) and use the fact that \(\int _2^\infty \frac{dt}{t \log ^2 t} = 1/\log 2\).

Corollary 7.1.12 Integral form for second error (prime form)

For any \(x \geq 2\), one has

\[ E_{2,p}(x) = \frac{E_{1,p}(x)}{\log x} - \int _x^\infty \frac{E_{1,p}(t)}{t \log ^2 t}\ dt \]
Proof

From Lemma 7.1.1 one has

\[ \sum _{p \leq x} \frac{1}{p} = \frac{1}{\log x} \sum _{p \leq x} \frac{\log p}{p} + \int _2^x \frac{1}{t \log ^2 t} \sum _{p \leq t} \frac{\log p}{p} \, dt. \]

Now substitute the definitions of \(E_{1,p}\), \(E_{2,p}\), \(M\) and simplify.

For any \(x \geq 2\), one has

\[ |E_{2,p}(x)| \leq \frac{\log 4 + 6 + E_1}{\log x}. \]
Proof

Similar to Lemma 7.1.12.

Theorem 7.1.5 Mertens’ second theorem (weak prime form)

For any \(x \geq 2\), one has

\[ \sum _{p \leq x} \frac{1}{p} = \log \log x + O(1). \]
Proof

Immediate from previous two corollaries.

Theorem 7.1.6 Formula for Meissel-Mertens constant

One has \(M = \gamma + \sum _p \log (1-\frac{1}{p}) + \frac{1}{p}\).

Proof

The RHS can be Taylor expanded as \(\sum _{j=2}^\infty \sum _p \frac{1}{jp^j}\). Meanwhile, the difference between \(\sum _{n \leq x} \frac{\Lambda (n)}{n \log n}\) and \(\sum _{p \leq x} \frac{1}{p}\) is equal to \(\sum _{j=2}^\infty \sum _{p: p^j \leq x} \frac{1}{j p^j}\). Applying the monotone convergence theorem, Lemma 7.1.13, and Lemma 7.1.9 gives the claim.

Definition 7.1.6 The remainder term in Mertens third theorem
#

We define \(E_3(x) := \sum _{p \leq x} \log (1 - \frac{1}{p}) + \log \log x + \gamma \).

Theorem 7.1.7 Mertens’ third theorem error term

For any \(x \geq 2\), one has

\[ \prod _{p \leq x} \left(1 - \frac{1}{p}\right) = \frac{e^{-\gamma }}{\log x} \exp (E_3(x)). \]
Proof

Immediate from definition

For any \(x \geq 2\), one has

\[ E_3(x) = O\left(\frac{1}{\log x}\right) \]
Proof

Using the Taylor expansion

\[ \log (1 - \frac{1}{p}) = \sum _{j=1}^\infty \frac{1}{jp^j} = \sum _{j=1}^\infty \frac{\Lambda (p^j)}{p^j \log p^j} \]

one can write

\[ E_3(x) = E_{2,\Lambda }(x) + \sum _{p \leq x} \sum _{j \geq 2: p^j {\gt} x} \frac{j}{p^j}. \]

One can bound \(\sum _{j \geq 2: p^j {\gt} x} \frac{j}{p^j}\) by \(O(1/p^2)\) when \(p {\gt} \sqrt{x}\) and by \(O(1/x)\) when \(p \leq \sqrt{x}\), so the second error here is \(O(1/\sqrt{x})\), giving the claim.