The energy landscape is a conceptual tool for describing dynamical behavior in physics. For a particle moving in 2D under conservative forces, the landscape is simply a plot of the potential $$ \phi $$ (the force is its negative gradient, $$ - \nabla \phi $$). For thermodynamic systems, the horizontal plane represents the space of all system configurations. The coordinates need not be Euclidean or even continuous. (For proteins, they might be rotation angles of all bonds along the polymer backbone.) Free energies (e.g., Helmholtz $$ A = U - TS $$) balance energetic and entropic forces. Under biological conditions, an unfolded protein traverses its rough, funnel-shaped landscape, moving "downhill" toward low-free-energy folded configurations.

Evolutionary dynamics can be framed in similar terms. The role of free energy is played by fitness, and the landscape is inverted (fitness is

Evolutionary models are diverse in their purposes and in the assumptions they make (e.g., random mating). The Bak-Sneppen model, for instance, is of great theoretical interest; it shows how self-organized criticality might explain the power-law extinction statistics seen in the fossil record. Ewens's sampling formula, on the other hand, is frequently applied to real populations to look for signatures of neutral evolution. Our work (Khromov et al., 2018) generalizes Ewens's formula to arbitrary fitness landscapes. It applies to populations on large sequence networks when the evolutionary forces of selection, mutation, and genetic drift all balance, creating a steady-state, de-labeled allele frequency distribution.
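
For orientation, the classical (neutral) Ewens sampling formula can be evaluated in a few lines. This is only a sketch of the standard textbook formula, not of the generalization in Khromov et al. The notation is an assumption introduced here: `theta` is the scaled mutation rate and `counts[j-1]` is $$ a_j $$, the number of allele types that appear exactly $$ j $$ times in a sample of size $$ n = \sum_j j\,a_j $$. The probability is $$ \frac{n!}{\theta(\theta+1)\cdots(\theta+n-1)} \prod_j \frac{\theta^{a_j}}{j^{a_j}\, a_j!} $$, computed below in log space.

```python
import math


def ewens_log_prob(counts, theta):
    """Log-probability of an allele configuration under the neutral Ewens formula.

    counts[j-1] is a_j, the number of allele types seen exactly j times.
    theta is the scaled mutation rate (notation assumed for this sketch).
    """
    n = sum(j * a for j, a in enumerate(counts, start=1))
    # log( n! / (theta * (theta+1) * ... * (theta+n-1)) ) via log-gamma
    log_p = math.lgamma(n + 1) - (math.lgamma(theta + n) - math.lgamma(theta))
    for j, a in enumerate(counts, start=1):
        # log( theta^a_j / (j^a_j * a_j!) )
        log_p += a * (math.log(theta) - math.log(j)) - math.lgamma(a + 1)
    return log_p


def allele_configs(n):
    """Yield every tuple (a_1, ..., a_n) with sum_j j * a_j == n, for n >= 1."""
    def rec(remaining, j):
        if j == 1:
            yield (remaining,)
            return
        for a in range(remaining // j + 1):
            for rest in rec(remaining - a * j, j - 1):
                yield rest + (a,)
    yield from rec(n, n)


if __name__ == "__main__":
    # Sanity check: probabilities over all configurations of a sample sum to 1.
    total = sum(math.exp(ewens_log_prob(c, 2.5)) for c in allele_configs(8))
    print(f"sum over configurations of n=8: {total:.12f}")
```

Enumerating every configuration is only practical for small samples; it is used here purely as a normalization check.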

Not surprisingly, the generalization comes with more complicated mathematics. Even after coarse-graining to a two- or three-plane landscape, the sampling probabilities still contain nested sums and infinite series. My primary role was to write a theory code to efficiently compute our analytic results and validate them against simulations. Values for $$\mathcal{F}$$ (the generalized $$ _1F_1 $$) were computed via a matrix of Bell polynomials. The choice of parameter values affected how quickly the sums converged, so truncation had to be done carefully. The GNU Scientific Library was used extensively, especially its special-function routines, which check for overflow/underflow and facilitate calculations in $$ \log $$ space.
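
To illustrate why log-space arithmetic and careful truncation matter, here is a minimal sketch that sums the ordinary confluent hypergeometric series $$ _1F_1(a; b; x) = \sum_k \frac{(a)_k}{(b)_k} \frac{x^k}{k!} $$ for positive $$ a, b, x $$. It is a stand-in only: it is not the generalized $$ \mathcal{F} $$, it does not use Bell polynomials, and it uses the Python standard library in place of GSL. The function name, tolerance, and stopping rule are assumptions made for this example; the stopping rule is a heuristic that waits until the terms are past their peak before testing their size.

```python
import math


def log_hyp1f1(a, b, x, rel_tol=1e-15, max_terms=100_000):
    """Return log(1F1(a; b; x)) for a, b, x > 0 by summing the series in log space.

    Hypothetical helper for illustration; restricted to positive arguments so
    that every term is positive and its logarithm is defined.
    """
    if not (a > 0 and b > 0 and x > 0):
        raise ValueError("this sketch requires a, b, x > 0")
    log_x = math.log(x)
    log_term = 0.0  # k = 0 term equals 1
    log_sum = 0.0
    for k in range(max_terms):
        # log of the ratio t_{k+1} / t_k = (a + k) x / ((b + k)(k + 1))
        log_ratio = math.log(a + k) - math.log(b + k) + log_x - math.log(k + 1)
        log_term += log_ratio
        # Stable log-sum-exp update: log(exp(log_sum) + exp(log_term))
        hi, lo = max(log_sum, log_term), min(log_sum, log_term)
        log_sum = hi + math.log1p(math.exp(lo - hi))
        # Truncate only once the terms are shrinking (past the peak near k ~ x)
        # and the latest term is negligible relative to the running sum.
        past_peak = (k + 1 > x) and (log_ratio < 0.0)
        if past_peak and log_term - log_sum < math.log(rel_tol):
            return log_sum
    raise ArithmeticError("series did not converge within max_terms")


if __name__ == "__main__":
    # 1F1(1; 1; x) = exp(x), so the log should equal x even when exp(x)
    # itself would overflow a double.
    print(log_hyp1f1(1.0, 1.0, 800.0))
```

Testing the term size alone can stop too early when the first few terms are tiny but later ones grow, which is one way parameter choices can spoil a naive truncation.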