Kvant Math Problem 1246
Let the arithmetic progression be
Verified: no
Verdicts: SKIP + SKIP
Solve time: 7m04s
Source on kvant.digital
Problem
Prove that in any infinite arithmetic progression whose terms are natural numbers, there exist two numbers with the same sum of digits.
S. Genkin
Leningrad City Mathematical Olympiad (1990)
Exploration
Let the arithmetic progression be
$$a,\ a+d,\ a+2d,\dots$$
with $a,d\in\mathbb N$.
The statement concerns the sum of digits function, usually denoted by $s(n)$. We must show that among the infinitely many values $s(a+kd)$, two are equal.
A first thought is that the digit sum grows very slowly compared with the numbers themselves. If the values $s(a+kd)$ were all distinct, then infinitely many distinct positive integers would appear as digit sums. Since $s(n)\le 9(\lfloor \log_{10}n\rfloor+1)$, the digit sum is at most logarithmic in $n$, whereas the progression contains linearly many terms up to a given bound. This suggests a counting argument.
Take the first $N$ terms of the progression. Their digit sums are bounded by the largest term, namely $a+(N-1)d$. Every one of these digit sums lies between $1$ and
$$9\bigl(\lfloor \log_{10}(a+(N-1)d)\rfloor+1\bigr).$$
Hence there are only about $9\log_{10}N$ possible digit-sum values among the first $N$ terms, while there are $N$ terms. For large $N$, the number of terms greatly exceeds the number of possible digit sums. Then two terms must share the same digit sum.
The only point requiring care is to make the counting completely rigorous and ensure that the upper bound on the digit sum is valid for every positive integer.
The core idea is therefore the pigeonhole principle combined with the logarithmic upper bound for digit sums.
Problem Understanding
We are given an infinite arithmetic progression of natural numbers,
$$a,\ a+d,\ a+2d,\dots,$$
and must prove that two terms of this progression have the same sum of decimal digits.
This is a Type B problem, a pure proof problem.
The core difficulty is to compare the number of terms in a long initial segment of the progression with the number of different digit-sum values that those terms can possibly attain. The progression produces linearly many numbers, whereas digit sums of numbers of a given size occupy only a logarithmically growing range.
Proof Architecture
The proof uses two claims.
First, if a positive integer has $m$ decimal digits, then its digit sum does not exceed $9m$. This follows because each digit is at most $9$.
Second, among the first $N$ terms of the progression, every digit sum is at most
$$9\bigl(\lfloor \log_{10}(a+(N-1)d)\rfloor+1\bigr),$$
so there are at most that many possible digit-sum values.
The main argument applies the pigeonhole principle. For sufficiently large $N$, the number $N$ of terms exceeds the number of possible digit-sum values, forcing two terms to have equal digit sums.
The most delicate point is the comparison between the linear quantity $N$ and the logarithmic quantity
$$9\bigl(\lfloor \log_{10}(a+(N-1)d)\rfloor+1\bigr).$$
Solution
Let $s(n)$ denote the sum of the decimal digits of the positive integer $n$.
Consider the arithmetic progression
$$a,\ a+d,\ a+2d,\dots,$$
where $a,d\in\mathbb N$.
For each positive integer $m$, every $m$-digit number has digit sum at most $9m$, because each of its $m$ digits is at most $9$. Hence, for every positive integer $n$,
$$s(n)\le 9\bigl(\lfloor \log_{10} n\rfloor+1\bigr).$$
Now fix a positive integer $N$ and examine the first $N$ terms of the progression:
$$a,\ a+d,\dots,a+(N-1)d.$$
Their largest element is
$$M=a+(N-1)d.$$
Since every one of these terms does not exceed $M$,
$$s(a+kd)\le 9\bigl(\lfloor \log_{10} M\rfloor+1\bigr) \qquad (0\le k\le N-1).$$
Therefore all digit sums occurring among the first $N$ terms belong to the set
$${1,2,\dots,9(\lfloor\log_{10}M\rfloor+1)}.$$
Consequently, the number of distinct digit-sum values among the first $N$ terms is at most
$$9\bigl(\lfloor \log_{10} M\rfloor+1\bigr).$$
Since
$$M=a+(N-1)d\le (a+d)N$$
for every $N\ge1$, we obtain
$$9\bigl(\lfloor \log_{10} M\rfloor+1\bigr) \le 9\bigl(\log_{10}((a+d)N)+1\bigr).$$
The right-hand side grows logarithmically with $N$, whereas $N$ itself grows linearly. Hence there exists a positive integer $N$ such that
$$N> 9\bigl(\log_{10}((a+d)N)+1\bigr).$$
For this choice of $N$,
$$N> 9\bigl(\lfloor \log_{10} M\rfloor+1\bigr).$$
Thus the first $N$ terms of the progression produce more numbers than there are possible digit-sum values among them.
By the pigeonhole principle, two of these $N$ terms must have the same digit sum. Hence there exist indices $i\ne j$ such that
$$s(a+id)=s(a+jd).$$
Thus the arithmetic progression contains two numbers with the same sum of digits.
This completes the proof.
∎
Verification of Key Steps
The first delicate step is the bound for digit sums. If $n$ has $m$ digits, write
$$n=\overline{d_md_{m-1}\dots d_1}.$$
Then
$$s(n)=d_1+\cdots+d_m.$$
Since each digit satisfies $0\le d_i\le9$,
$$s(n)\le 9m.$$
Because $m=\lfloor\log_{10}n\rfloor+1$, the stated inequality follows.
The second delicate step is counting possible digit sums. The argument does not require that every value from $1$ up to the bound actually occurs. It uses only that every digit sum lies inside that interval. Therefore the number of distinct digit sums is at most the length of the interval, which is sufficient for the pigeonhole principle.
The third delicate step is comparing growth rates. We need an actual inequality, not merely the statement that one function is linear and the other logarithmic. Since
$$\frac{9(\log_{10}((a+d)N)+1)}{N}\to0 \qquad (N\to\infty),$$
there exists $N$ for which
$$N> 9(\log_{10}((a+d)N)+1).$$
That is the precise fact used in the proof.
Alternative Approaches
A different proof uses residues modulo $9$. The identity
$$s(n)\equiv n \pmod 9$$
shows that all terms of the progression have digit sums lying in only one residue class modulo $\gcd(d,9)$. One can then combine this congruence restriction with the bound $s(n)\le 9(\lfloor\log_{10}n\rfloor+1)$ and count how many admissible digit-sum values are available among the first $N$ terms. The number of admissible values still grows only logarithmically, while the number of terms grows linearly, leading again to a pigeonhole argument.
The main proof is preferable because it avoids congruence considerations entirely. The logarithmic bound on digit sums and the pigeonhole principle already yield the result in a direct and transparent way.