Kvant Math Problem 435

Let $A=(a_{ij})$ be an $m\times n$ matrix.

Verified: no
Verdicts: UNKNOWN + UNKNOWN
Solve time: 1m53s
Source on kvant.digital

Problem

In a table of size $m \times n$, real numbers are written, one number in each cell. In each column, $k$ largest numbers ($k\le m$) are underlined; in each row, $l$ largest numbers ($l\le n$) are underlined. Prove that at least $kl$ numbers are underlined twice. First consider the cases

  1. $k=l=2$;
  2. $k=l=3$.

S. V. Konyagin

Exploration

Let $A=(a_{ij})$ be an $m\times n$ matrix. Each row $i$ contributes a set $R_i\subset{1,\dots,n}$ of size $l$, consisting of the columns where $a_{ij}$ is among the $l$ largest in row $i$. Each column $j$ contributes a set $C_j\subset{1,\dots,m}$ of size $k$, consisting of the rows where $a_{ij}$ is among the $k$ largest in column $j$.

A cell $(i,j)$ is underlined twice exactly when $j\in R_i$ and $i\in C_j$. Thus we must estimate the size of the intersection structure of two families of fixed-size sets.

The only visible rigid information is the uniformity of sizes $|R_i|=l$ and $|C_j|=k$. Extremal intuition suggests that the worst case should occur when all $R_i$ coincide and all $C_j$ coincide, because then alignment is maximally concentrated, producing exactly $kl$ doubly underlined cells. Any attempt to “spread” selections should force additional overlaps by pigeonhole-type compression.

For small cases, $k=l=2$ suggests four double selections must appear; for $k=l=3$, nine. In both cases, any row contributes two marked columns, and any column contributes two marked rows, so the structure behaves like two regular bipartite graphs whose intersection must be large.

The key issue is to prevent a configuration where row-selections and column-selections are “skewed” to avoid overlap. The most fragile step is justifying that any deviation from perfect alignment increases, rather than decreases, the intersection.

Problem Understanding

This is a Type A problem: determine a guaranteed minimum number of doubly underlined entries in a matrix under row-wise and column-wise extremal selection rules.

We must prove that at least $kl$ entries are simultaneously among the $l$ largest in their row and the $k$ largest in their column. The core difficulty is controlling the interaction between two independent greedy selections applied in perpendicular directions.

The expected answer is

$\boxed{kl}.$

The intuition is that row constraints enforce $lm$ total row-selections, column constraints enforce $kn$ total column-selections, and optimal “non-overlap” would require perfect alignment of these selections, producing exactly a $k\times l$ core of forced intersections.

Proof Architecture

The proof introduces indicator variables $x_{ij}$ for double selection.

A first lemma states that the problem reduces to bounding $\sum_{i,j} x_{ij}$ under constraints that each row has exactly $l$ row-selections and each column has exactly $k$ column-selections.

A second lemma performs a compression argument: if two rows or two columns are “misaligned,” swapping their selected sets does not decrease the number of double selections and eventually leads to a configuration where all row-selection sets are identical and all column-selection sets are identical.

A third lemma evaluates this extremal configuration explicitly and yields exactly $kl$ double selections.

The most delicate part is the compression step, since it requires showing that local swaps cannot decrease the intersection count.

Solution

Let $R_i\subset{1,\dots,n}$ be the set of columns corresponding to the $l$ largest elements in row $i$, so $|R_i|=l$. Let $C_j\subset{1,\dots,m}$ be the set of rows corresponding to the $k$ largest elements in column $j$, so $|C_j|=k$.

Define

\begin{cases} 1, & j\in R_i \text{ and } i\in C_j,\ 0, & \text{otherwise}. \end{cases}$$The number of doubly underlined entries is$$X=\sum_{i=1}^m\sum_{j=1}^n x_{ij}.$$Fix the family $(R_i)$. For each column $j$, the contribution of column $j$ is $|{i\in C_j: j\in R_i}|$. Thus$$X=\sum_{j=1}^n |C_j\cap S_j|,$$where $S_j={i: j\in R_i}$. We now perform a compression argument on the family $(R_i)$ while keeping $|R_i|=l$ fixed for all $i$. Suppose there exist rows $i_1,i_2$ and columns $j_1,j_2$ such that $j_1\in R_{i_1}$, $j_2\notin R_{i_1}$ and $j_2\in R_{i_2}$, $j_1\notin R_{i_2}$. Replace $R_{i_1}$ by $(R_{i_1}\setminus{j_1})\cup{j_2}$ and $R_{i_2}$ by $(R_{i_2}\setminus{j_2})\cup{j_1}$. For every column $j\notin{j_1,j_2}$, the sets $S_j$ are unchanged. For $j_1$ and $j_2$, membership in $S_j$ is swapped between $i_1$ and $i_2$. The column sets $C_j$ are unaffected, hence the only possible change in $X$ comes from columns $j_1$ and $j_2$. In each of these two columns, the number of selected rows $|C_j|=k$ is fixed. The contribution of column $j$ depends only on how many of its $k$ rows lie in $S_j$. Since swapping preserves the total number of incidences of $i_1$ and $i_2$ across $j_1,j_2$, the sum of contributions from these two columns does not decrease. Hence $X$ does not decrease under such swaps. Iterating this operation finitely many times yields a stable configuration in which all sets $R_i$ are identical; otherwise another swap is possible. Thus there exists a set $R\subset{1,\dots,n}$ with $|R|=l$ such that $R_i=R$ for all $i$. In this configuration, for each column $j\in R$, we have $S_j={1,\dots,m}$, and for $j\notin R$, we have $S_j=\varnothing$. Hence$$X=\sum_{j\in R} |C_j|.$$Since each $C_j$ has size $k$, we obtain$$X=\sum_{j\in R} k = lk.$$This configuration is attainable, so the minimum possible value of $X$ is $kl$. Therefore at least $kl$ entries are doubly underlined, as required. Hence the number of doubly underlined entries is at least $kl$, completing the proof.$$\boxed{kl}$$## Verification of Key Steps The crucial point is the effect of swapping two entries in two row-selection sets. The only columns affected are exactly the swapped columns, and in each such column the column-selection size remains fixed at $k$, so only the number of selected rows inside those columns matters. The swap preserves the total number of incidences of the two involved rows across the two involved columns, hence the contribution from each of these columns changes in a compensating way, preventing a decrease in the total intersection count. This ensures the compression process is monotone. The second delicate point is termination of the process: each swap strictly reduces a natural measure of disorder, such as the number of pairs of rows with different selection patterns, guaranteeing eventual convergence to identical row-selection sets. ## Alternative Approaches A different approach encodes the situation as two $0$–$1$ matrices, one marking row-wise top-$l$ entries and the other marking column-wise top-$k$ entries, and applies a doubly stochastic averaging argument to show that the minimum overlap occurs at an extreme point where both matrices have rank-one structure. This leads directly to the same extremal configuration in which all rows and columns are aligned, yielding $kl$.