In a lab somewhere, probably unremarkable from the outside, fluorescent-lit and cluttered with half-empty coffee cups, a researcher is staring at a dataset containing measurements from 47,000 human genes. All of them came from a single tissue sample, and all were taken at the same time. The biology alone is overwhelming. The mathematics of what follows, however, is stranger still, and far more fascinating than it usually gets credit for.
For years, high-dimensional geometry, a branch of mathematics that deals with spaces of hundreds or thousands of dimensions instead of the three we encounter every day, has been quietly shaping drug discovery and genomics. No press conferences mark its arrival. It has no celebrity supporters. It’s the kind of advance that usually becomes apparent only in hindsight, when someone looks back and realizes that a problem that once seemed unsolvable has somehow been solved.

The main challenge in drug discovery has always been finding a molecule that binds to a particular protein or stretch of DNA and does something beneficial without causing harm elsewhere. Put that way, it sounds almost straightforward. In reality, the search space of viable molecules is so large that testing each candidate individually would take centuries. On average, bringing a single medication to market takes more than ten years and about $2.6 billion. Despite decades of effort, those figures haven’t changed much, which is likely why researchers began looking for entirely different approaches to the problem.
Geometry offers a way to understand structure that statistical brute force does not. When genomic data is viewed as points in a high-dimensional space, with each gene measurement serving as a coordinate, patterns appear that have no clear equivalent in lower-dimensional thinking. Clusters form. Distances carry meaning. Two cancer patients with identical surface-level diagnoses may sit at opposite ends of this abstract geometric landscape, which could help explain why they respond differently to the same treatment. Beneath its clinical exterior, the MammaPrint signature, which was created using 25,000 human genes from almost 300 cases of breast cancer, is in effect an exercise in navigating geometry.
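To make the picture of patients as points concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the data is simulated rather than real, the two "subtypes" are hypothetical, the 70 coordinates are an arbitrary illustrative size, and plain Euclidean distance stands in for whatever measure a real analysis would use. It shows only that patients from the same simulated subtype sit closer together than patients from different ones.

```python
import math
import random


def euclidean(p, q):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def simulate_patient(center, noise=0.3):
    """One hypothetical patient: the subtype's pattern plus individual noise."""
    return [c + random.gauss(0, noise) for c in center]


random.seed(0)   # fixed seed so the simulation is repeatable
N_GENES = 70     # one coordinate per gene; the number is illustrative

# Two hypothetical subtypes, each with its own typical expression pattern.
center_a = [random.gauss(0, 1) for _ in range(N_GENES)]
center_b = [c + random.gauss(0, 1) for c in center_a]

group_a = [simulate_patient(center_a) for _ in range(5)]
group_b = [simulate_patient(center_b) for _ in range(5)]

# Average distance between patients of the same subtype...
within_a = mean(
    euclidean(p, q) for i, p in enumerate(group_a) for q in group_a[i + 1:]
)
# ...and between patients of different subtypes.
between = mean(euclidean(p, q) for p in group_a for q in group_b)

print(f"mean distance within subtype A: {within_a:.2f}")
print(f"mean distance between subtypes: {between:.2f}")
```

Real signatures involve far more careful modelling than this, but the underlying picture is the same: patients become points, and the distances between them become something that can be measured.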
Most clinicians who use these tools may never think of them this way. And that’s okay. Still, there is an idea here worth considering: the mathematics describing the shape of a 70-gene expression profile is somewhat similar to the mathematics describing the shape of a protein pocket that a drug molecule must fit into. Both require navigating spaces that lie beyond the reach of human intuition.
There is also something humbling about it. For a long time, it was assumed that more information would inevitably lead to better answers. Instead, researchers ran into the paradoxical phenomenon known as the “curse of dimensionality”: increasing the number of measurements can actually make patterns harder to identify, because the data becomes sparser relative to the space it occupies. High-dimensional geometry behaves in unexpected ways. Distances bunch together until near and far points are hard to tell apart. Clusters dissolve. Many early high-dimensional genomic models may not have adequately accounted for this, which raises serious questions about some conclusions drawn before the mathematics was better understood.
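The way distances lose their meaning can be seen in a small experiment. The sketch below is an illustration under stated assumptions, not a model of any genomic dataset: it scatters 200 uniformly random points in a unit cube, picks a random query point, and reports the "relative contrast", a name used here for the gap between the farthest and nearest point divided by the nearest distance. The function name and all parameters are hypothetical choices made for this example. It uses `math.dist`, which requires Python 3.8 or later.

```python
import math
import random


def relative_contrast(n_points, n_dims, rng):
    """(farthest - nearest) / nearest distance from one random query point.

    A large value means "near" and "far" are clearly different;
    a value close to zero means all points are almost equally far away.
    """
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


rng = random.Random(0)  # fixed seed so the run is repeatable

# Same number of points each time; only the number of dimensions changes.
contrasts = {d: relative_contrast(200, d, rng) for d in (2, 10, 100, 1000)}

for n_dims, contrast in contrasts.items():
    print(f"{n_dims:>5} dimensions: relative contrast = {contrast:.3f}")
```

With the number of points held fixed, the contrast should shrink sharply as dimensions are added, which is one way of seeing why a method that relies on "nearest" neighbours can struggle when every measurement becomes its own coordinate.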
The current moment feels different. Improved computational tools, sharper geometric intuitions, and a willingness to treat drug design as fundamentally a problem of shape and space rather than trial and error appear to be converging. In a field that rarely makes headlines, the growing momentum is hard to ignore. The math was always there. It’s only recently that medicine has learned to use it.

