# Symmetry, near-symmetry and energetics

## Transcript Of Symmetry, near-symmetry and energetics

27 March 1998 Chemical Physics Letters 285 (1998) 330–336

Symmetry, near-symmetry and energetics

David J. Wales

University Chemical Laboratories, Lensfield Road, Cambridge CB2 1EW, UK

Received 12 December 1997; in final form 13 January 1998

Abstract

Is there a connection between symmetry and energetics? Surveys of over 300000 local minima from nine different model systems reveal some correlation between the degree of symmetry and the mean energy. However, both theory and calculation suggest that the variance may be more important. Systems with higher symmetry, defined in terms of the spectrum of contributions to the total energy, are more likely to exhibit particularly low and high energy minima. This change in viewpoint may be sufficient to account for the widespread appearance of high symmetry. © 1998 Elsevier Science B.V.

1. Introduction

The aesthetics and mathematics of symmetrical structures have intrigued humankind across the ages, from Pythagoras, Euclid and Plato to da Vinci and Buckminster Fuller [1–5]. The appearance of high symmetry has often been commented upon. Pierre Curie postulated that ‘the symmetry characteristic of a phenomenon is the maximal symmetry compatible with the existence of the phenomenon’ [6]. D’Arcy Thompson wrote that ‘the perfection of mathematical beauty is such . . . that whatsoever is most beautiful and regular is also found to be most useful and excellent’ [7]. However, it is not difficult to construct counterexamples to show that systems do not necessarily exhibit their highest possible symmetry.

The most convincing example of a maximum symmetry result is perhaps the ‘epikernel principle’ of Ceulemans et al., which proposes that when the Jahn–Teller effect operates the resulting symmetry breaking actually maintains as much symmetry as possible [8,9]. One can also prove that so-called ‘balanced structures’ [10], where all the atoms lie on a rotation axis, are in tangential equilibrium. This result generally means that there is a stationary point of the balanced shape for some particular system size, but the stationary point need not be a minimum [11].

Recently several authors have referred to ‘symmetries’ exhibited by relatively large biomolecules which cannot be described within the framework of point groups. Chen and Dill have developed an approach based upon knot theory to define and classify symmetries in compact polymers [12]. Lindgård and Bohr have found magic numbers for secondary structure in proteins by projecting consecutive secondary structures onto a lattice [13]. Kellman has commented that ‘many globular proteins fold to native states with great regularity in their tertiary structure, reminiscent of symmetry’ [14]. His analysis reveals permutation symmetries in model proteins when the links between the residues on a lattice are ignored. Yue and Dill have suggested that the ubiquity of symmetry arises because symmetric patterns with nondegenerate ground states are easy to design [15]. Kellman’s study supports their view that there is a connection between symmetry and low degeneracy, where in this context degeneracy refers to the number of structures with a similar energy.

0009-2614/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0009-2614(98)00044-X

Clearly the manifestation of symmetry in biological systems through the duplication of components from the same information can provide an evolutionary advantage. This was the argument used by Watson and Crick in predicting that virus capsids would be made of repeating subunits [16]. There are many other examples of oligomeric biomolecules, including enzymes such as hemoglobin, where the repeated units are in symmetry inequivalent environments and so the overall point group is clearly $C_1$, despite the intuitively symmetric appearance of the molecule. However, Wolynes has recently deduced an approximate connection between the designability of a given heteropolymer structure and the symmetries of a transformed free energy function [17].

The common association of high symmetry with low energy in the literature and the new results appearing for large biomolecules have been touched upon above. Furthermore, to include the observations for biomolecules we must extend our notion of symmetry beyond point groups. The present contribution addresses both issues. A qualitative argument suggests that high symmetry structures are more likely to have either particularly low or particularly high energies, even when the average energy depends only weakly upon symmetry. Extensive surveys of local minima for a number of model systems generally exhibit the predicted features. The resulting change of viewpoint may be sufficient to explain the prevalence of symmetry, including some insight into the effects of approximate symmetry.

2. The consequences of symmetry and near-symmetry

For any system we can write the potential energy as a series of terms representing the interactions between increasingly large sets of atoms:

$$
E = \varepsilon^{(0)} + \sum_i \varepsilon_i^{(1)} + \sum_i \sum_{j<i} \varepsilon_{ij}^{(2)} + \sum_i \sum_{j<i} \sum_{k<j} \varepsilon_{ijk}^{(3)} + \cdots, \tag{1}
$$

where, for example, $\varepsilon_{ij}^{(2)}$ represents the two-body energy contribution from atoms $i$ and $j$. If the molecule has $N$ atoms then the last term in the series represents the $N$-body interaction. The above form is generally applicable, except where quantum mechanical states approach degeneracy, in which case we must instead write separate series for every term in a matrix whose eigenvalues then give the allowed energy levels. We will therefore proceed on the basis that the total energy is a sum of contributions $\{\varepsilon_t\}$ resulting from one-body, two-body, etc., interactions. Here $\{\varepsilon_t\}$ is the set containing all the various energy contributions $\varepsilon_{ij\ldots}^{(m)}$ in (1), ranked in ascending order: $\varepsilon_1 \le \varepsilon_2 \le \varepsilon_3 \ldots$.
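The ranked set of contributions can be illustrated for the simplest case, a pairwise-additive model, where only the two-body terms of (1) are non-zero. The following is a minimal sketch, assuming a Lennard-Jones pair potential in reduced units and a made-up four-atom geometry; the function names `lj_pair` and `sorted_contributions` are hypothetical and are not taken from the paper.

```python
import itertools
import math


def lj_pair(r):
    """Lennard-Jones pair energy in reduced units (well depth and sigma both 1)."""
    return 4.0 * (r ** -12 - r ** -6)


def sorted_contributions(coords):
    """Return every two-body term of a pairwise-additive model, in ascending order."""
    terms = [lj_pair(math.dist(a, b)) for a, b in itertools.combinations(coords, 2)]
    return sorted(terms)


# Hypothetical geometry, chosen only for illustration (it is not a local minimum).
coords = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 1.2, 0.0), (0.3, 0.4, 1.0)]

contribs = sorted_contributions(coords)  # the ranked set of contributions
total_energy = sum(contribs)             # E is just the sum of the set
```

For $N$ atoms and a pair potential the set has $N(N-1)/2$ members, so here it contains six terms. Models with three-body or higher terms would simply add more entries to the same list.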

Usually symmetry tells us when integrals representing quantum mechanical expectation values vanish, enabling us to derive spectroscopic selection rules, for example. However, it is important to remember that symmetry can also tell us when parameters such as bond lengths and angles must be the same. The total energy is invariant under any point group symmetry operation, $\hat{Q}$. $\hat{Q}$ must permute the energy contributions $\{\varepsilon_t\}$ in such a way that $E$ is unchanged. Hence the $\{\varepsilon_t\}$ can be separated into sets which are permuted only amongst themselves and therefore form representations of the appropriate point group, $G$. Each set of $\{\varepsilon_t\}$ which is mapped into itself under any operation $\hat{Q} \in G$ and cannot be further subdivided into smaller sets is called an orbit of the point group. For the energy contributions, $\{\varepsilon_t\}$, all the members of an orbit have the same numerical value. The number of members in an orbit must be equal to the order of a subgroup of $G$, ranging from one to $h_G$, the order of $G$ itself.

The ordered set of energy contributions, $\{\varepsilon_t\}$, may contain exact duplications whenever the system possesses geometrical symmetry elements. Hence we may write the total energy as

$$
E = \sum_{i=1}^{M} \varepsilon_i = \sum_{i=1}^{n} A_i \varepsilon'_i, \tag{2}
$$

where $M$ is the total number of members in the set $\{\varepsilon_t\}$, which is fixed for a given molecular stoichiometry, and $\sum_{i=1}^{n} A_i = M$. $\{\varepsilon'_t\}$ is then the ordered set of unique energy contributions, with duplicates removed. The allowed values for the $A_i$ correspond to orbits of the prevailing point group. For example, if the molecule has point group $I_h$ then the allowed values of $A_i$ are [18] 1, 12, 20, 30, 60 and 120.
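A small worked case may make Eq. (2) concrete. The sketch below assumes four identical atoms at the corners of a square interacting through a Lennard-Jones pair potential in reduced units; the side length and all names are illustrative choices, not values from the paper. The four edges form one set of equal contributions and the two diagonals another, so $M = 6$ terms collapse to $n = 2$ unique values.

```python
import itertools
import math
from collections import Counter


def lj_pair(r):
    """Lennard-Jones pair energy in reduced units."""
    return 4.0 * (r ** -12 - r ** -6)


a = 1.1  # hypothetical side length of the square
square = [(0.0, 0.0), (a, 0.0), (a, a), (0.0, a)]
terms = [lj_pair(math.dist(p, q)) for p, q in itertools.combinations(square, 2)]

# Round before counting so floating-point noise cannot split a degenerate set.
counts = Counter(round(t, 10) for t in terms)
unique = sorted(counts)                 # unique contributions, ascending
weights = [counts[e] for e in unique]   # the A_i of Eq. (2)

M, n = len(terms), len(unique)
energy_from_weights = sum(A * e for A, e in zip(weights, unique))
```

Here the edge term is the lower of the two values, so the weights come out as 4 (edges) followed by 2 (diagonals), and they sum to $M$ as Eq. (2) requires.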

In the present analysis we extend the notion of symmetry to include cases where there are close, but not exact, degeneracies in the set $\{\varepsilon_t\}$. Here we must inevitably introduce a new parameter, $r$, which corresponds to the resolution with which we view the $\{\varepsilon_t\}$ distribution. The $\{\varepsilon'_t\}$ set is now constructed by replacing any sequence $\varepsilon_j, \varepsilon_{j+1}, \ldots, \varepsilon_{j+m}$, where $\varepsilon_{j+1} - \varepsilon_j < r$, $\varepsilon_{j+2} - \varepsilon_{j+1} < r$, $\ldots$, $\varepsilon_{j+m} - \varepsilon_{j+m-1} < r$, with the mean value $\varepsilon'_j = \sum_{i=j}^{j+m} \varepsilon_i / m$. If $\varepsilon'_j$ appears with weight $A'_j = m$ then the expression for the total energy is still exact, independent of the resolution:

$$
E = \sum_{i=1}^{n'} A'_i \varepsilon'_i. \tag{3}
$$
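The grouping rule above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the author's code: the function name `bunch` and the sample contributions are made up, and each group is weighted by its number of members, which is what keeps the weighted sum equal to the total energy at any resolution.

```python
def bunch(contribs, r):
    """Group sorted contributions whose successive gaps are smaller than r.

    Returns (weights, means): the number of members of each group and the
    mean value of each group, in ascending order.
    """
    ordered = sorted(contribs)
    if not ordered:
        return [], []
    groups = [[ordered[0]]]
    for value in ordered[1:]:
        if value - groups[-1][-1] < r:
            groups[-1].append(value)   # still within resolution of its neighbour
        else:
            groups.append([value])     # gap too large: start a new group
    weights = [len(g) for g in groups]
    means = [sum(g) / len(g) for g in groups]
    return weights, means


# Made-up contributions with two near-degenerate clusters and one isolated term.
contribs = [-1.000, -0.999, -0.998, -0.500, -0.499, 0.200]
weights, means = bunch(contribs, r=0.004)
energy = sum(A * e for A, e in zip(weights, means))
```

With this resolution the six terms fall into groups of three, two and one members, and the weighted sum of the group means reproduces the plain sum of the contributions to within rounding error.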

For example, in a purely oligomeric molecule with point group $C_1$ we would expect many $A'_i$ to be characteristic of the point group appropriate for the repeated unit, rather than unity, if the resolution is chosen appropriately. Hence, at the expense of one new parameter, $r$, we can incorporate both exact and near symmetry. In this framework the amount of symmetry may be quantified either by the mean value of $A'_i$, $\langle A' \rangle$ (large for high symmetry), or the number of terms in the sum, $n'$ (small for high symmetry). These measures are equivalent because the total number of energy contributions is fixed for a given system at $M$, so that $M = \sum_{i=1}^{n'} A'_i = n' \langle A' \rangle$. We note that Avnir and coworkers have also developed a continuous measure of symmetry [19], and it may be possible to connect the two approaches in future work.
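The equivalence of the two measures is simple arithmetic, as the following sketch shows. The weight lists are hypothetical: one mimics a structure whose six contributions bunch into two groups, the other a structure with no bunching at all.

```python
def symmetry_measures(weights):
    """Return (n_prime, mean_weight) for a list of group weights A'_i."""
    n_prime = len(weights)
    mean_weight = sum(weights) / n_prime
    return n_prime, mean_weight


bunched = [4, 2]                 # hypothetical high-symmetry case, M = 6
unbunched = [1, 1, 1, 1, 1, 1]   # hypothetical low-symmetry case, M = 6

n_hi, mean_hi = symmetry_measures(bunched)
n_lo, mean_lo = symmetry_measures(unbunched)
```

Because the product of the two measures is always $M$, ranking structures by small $n'$ or by large $\langle A' \rangle$ gives the same ordering for a fixed composition.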

To investigate the relation between symmetry and energy we must consider the probability distribution for $E$ as a function of $n'$. It is only possible to draw qualitative conclusions because $E$ is the sum of $n'$ distinct $A'_i \varepsilon'_i$ terms, which are not really independent. Furthermore, both $A'_i$ and $\varepsilon'_i$ depend upon the geometry (and the resolution, $r$). To make contact with thermodynamics we note that the potential energy makes the most important contribution to the free energy at low temperature in the canonical ensemble. The present data would allow approximate calculations of free energies or entropies, but we will concentrate on the potential energy here to avoid further approximations.

The mean and variance of the sum of $n'$ variables, $x_i$, drawn at random from the same distribution, $P(x)$, are simply $n'\mu$ and $n'\sigma^2$, where $\mu$ and $\sigma^2$ are the mean and variance of $P(x)$. Eq. (3) shows that the energy can be written as $n'$ terms of the form $A'_i \varepsilon'_i$. To a first approximation it is reasonable to assume that the mean value of these terms will scale as $1/n'$. This scaling is necessary to give an average total energy that does not vary strongly with $n'$, as found in the numerical examples below. The variance of the $A'_i \varepsilon'_i$ terms is then expected to scale as $1/(n')^2$, and multiplying by the $n'$ terms in the sum suggests that the variance of the energy should scale as $1/n'$. In other words, even if the expected energy of a given minimum does not depend strongly on symmetry, the variance will. Hence it is not necessary to associate low energy with high symmetry to explain the prevalence of high symmetry. Due to the larger variance we expect to find both significantly higher and lower energy minima amongst structures with higher symmetry measures, as judged by the value of $n'$. This observation is sufficient to account for the predominance of high symmetry amongst the lowest energy isomers. We also expect the lowest minima to occur in a tail of the probability distribution, whose width increases with increasing symmetry, that is, with decreasing $n'$. These results may help to explain the observation of Yue and Dill [20] that lattice model proteins exhibit relatively few conformers with low energy and high ‘symmetry’.
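The scaling argument can be checked with a toy simulation. The sketch below is an illustration only: it assumes independent Gaussian terms (the paper stresses that the real terms are not independent), and the constants `mu0` and `s0` are arbitrary. Each term is given mean `mu0 / n_prime` and standard deviation `s0 / n_prime`, so the sum should have a mean near `mu0` and a variance near `s0**2 / n_prime`.

```python
import random
import statistics


def sample_energies(n_prime, n_samples=4000, mu0=-10.0, s0=3.0, seed=0):
    """Draw total energies as sums of n_prime independent Gaussian terms.

    Each term has mean mu0 / n_prime and standard deviation s0 / n_prime,
    mimicking the assumed 1/n' scaling of the individual contributions.
    """
    rng = random.Random(seed)
    return [
        sum(rng.gauss(mu0 / n_prime, s0 / n_prime) for _ in range(n_prime))
        for _ in range(n_samples)
    ]


results = {}
for n_prime in (5, 20, 80):
    energies = sample_energies(n_prime)
    results[n_prime] = (statistics.fmean(energies), statistics.pvariance(energies))
```

In this toy model the sample mean stays close to `mu0` for every `n_prime`, while the sample variance falls roughly fourfold each time `n_prime` is quadrupled, which is the behaviour the argument predicts.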

A more quantitative result can be obtained by applying the central limit theorem to obtain the distribution of the energy as a function of $n'$, which assumes that $n'$ is large and that the contributions to $E$ are uncorrelated. The first assumption is reasonable for the systems of interest here, but the second assumption is only an approximation. The resulting probability distribution is:

$$
P(E, n') = \frac{1}{\sqrt{2\pi n' \langle (A'\varepsilon')^2 \rangle_{n'}}} \exp\left[ -\frac{\left( E - n' \langle A'\varepsilon' \rangle_{n'} \right)^2}{2 n' \langle (A'\varepsilon')^2 \rangle_{n'}} \right], \tag{4}
$$

where the averages denoted by angle brackets are for fixed $n'$. The mean of the distribution is therefore $n' \langle A'\varepsilon' \rangle_{n'}$ and the variance is $n' \langle (A'\varepsilon')^2 \rangle_{n'}$. As above, we might expect the mean to change only weakly with $n'$ and the variance to scale roughly as $1/n'$.
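Eq. (4) is an ordinary normal density, so it is easy to evaluate directly. The sketch below is a hedged illustration: the function name `energy_density` and the numerical values passed to it are hypothetical, with `mean_term` standing for $\langle A'\varepsilon' \rangle_{n'}$ and `mean_sq_term` for $\langle (A'\varepsilon')^2 \rangle_{n'}$ exactly as they appear in Eq. (4). A trapezoidal integration is included as a sanity check on the normalisation.

```python
import math


def energy_density(E, n_prime, mean_term, mean_sq_term):
    """Normal density of Eq. (4): mean n'<A'e'>, variance n'<(A'e')^2>."""
    mean = n_prime * mean_term
    variance = n_prime * mean_sq_term
    prefactor = 1.0 / math.sqrt(2.0 * math.pi * variance)
    return prefactor * math.exp(-((E - mean) ** 2) / (2.0 * variance))


# Hypothetical inputs, for illustration only.
n_prime, mean_term, mean_sq_term = 50, -0.2, 0.05
mean = n_prime * mean_term
sigma = math.sqrt(n_prime * mean_sq_term)

# Trapezoidal rule over mean +/- 10 sigma with 2001 grid points.
step = 0.01 * sigma
grid = [mean + sigma * (-10.0 + 0.01 * k) for k in range(2001)]
values = [energy_density(E, n_prime, mean_term, mean_sq_term) for E in grid]
integral = step * (sum(values) - 0.5 * (values[0] + values[-1]))
```

The integral should be very close to one, and the density peaks at the mean, as expected for a Gaussian.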

For a system which possesses symmetry in terms of weights, $A'_i$, that are greater than unity, the search for local minima with fixed values of the $A'_i$ is a constrained optimization problem. There are bound to be fewer solutions to this problem than when all the atoms are allowed to move independently. Hence we expect the number of local minima on the potential energy surface to decrease with increasing symmetry, where again we measure symmetry in terms of $\langle A' \rangle$ or $n'$.

3. Numerical results

Nine different model systems were considered. The smallest is a 13-atom cluster bound by the Morse potential. In reduced units there is one free parameter, $r_0$, left which corresponds to the range of the interaction [22]. The results are for almost complete sets of minima at four different values of $r_0$, which count as four of the nine systems. The five other systems all have astronomically large numbers of local minima for which representative samples were collected. We present results for 55- and 100-atom clusters bound by the Lennard-Jones potential, the ‘three-colour, 46-bead’ model polypeptide adapted from the work of Thirumalai [27] by Berry et al. [24], the cluster Ag$_{80}$ modelled by a many-body Sutton–Chen potential [25] and the (H$_2$O)$_{20}$ cluster modelled by the rigid body TIP4P potential [26]. The model polypeptide potential contains three- and four-body forces via the bond-angle and dihedral-angle terms, and the Sutton–Chen potential includes pairwise and $N$-body terms. The remaining potentials are all pairwise additive.

The price for extending the analysis to incorporate the continuous symmetry measure is the resolution parameter, $r$. If $r$ is too small then only exact degeneracies in the $\{\varepsilon_t\}$ set will be detected, while if $r$ is too large then unrelated $\varepsilon_i$ will be lumped together. Neither of these limits is appropriate, and so we expect to find an optimal resolution for each system where the distribution of the local minima energies as a function of $n'$ best agrees with the theory. For each of the above models we have analyzed the $\{\varepsilon_t\}$ over resolutions varying by at least six orders of magnitude and we have also attempted to fit the distributions to various functional forms. Here we will avoid further numerical presentations by simply showing plots of the local minima energies $E$ against $n'$ for the resolution which appears to give the best agreement with the above theory (Fig. 1).
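The two limits of the resolution parameter can be seen in a simple scan. The sketch below is illustrative and uses made-up contributions; `count_groups` is a hypothetical helper that returns $n'$ by counting the gaps between successive sorted contributions that are at least $r$. The scan covers six orders of magnitude in $r$.

```python
def count_groups(contribs, r):
    """Return n', the number of groups left after bunching at resolution r."""
    ordered = sorted(contribs)
    if not ordered:
        return 0
    # A new group starts at every gap that is not smaller than r.
    breaks = sum(1 for a, b in zip(ordered, ordered[1:]) if b - a >= r)
    return 1 + breaks


# Made-up contributions: two tight clusters plus two isolated terms.
contribs = [-1.0, -0.9996, -0.9991, -0.62, -0.6195, -0.31, 0.05]

resolutions = [10.0 ** k for k in range(-6, 1)]  # 1e-6, 1e-5, ..., 1
scan = [(r, count_groups(contribs, r)) for r in resolutions]
```

For very small $r$ every term is its own group, so $n'$ equals the number of contributions; for very large $r$ everything is lumped into a single group. The plateau in between, where $n'$ is insensitive to $r$, is one plausible indicator of a sensible resolution for this toy data, although the paper selects the resolution by agreement with the theory.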

In general the predicted trends are quite well borne out in Fig. 1, with the widths of the distributions increasing as $n'$ decreases. In judging the quality of this agreement it is important to note that the population of minima at high $n'$ vastly exceeds that at small $n'$, and that this effect is exacerbated by the difficulty in sampling the sparsely populated high energy regions at small values of $n'$. Hence it is not surprising that there are usually a few particularly high and low energy minima which do not possess high symmetry measures. Symmetry constrained sampling was used for all the atomic clusters to find the more elusive high symmetry, high energy minima. In every case the structures were tightly converged and their normal modes calculated to verify that they were true minima rather than saddle points. Most of the distributions appear to be asymmetric, exhibiting longer tails at high energy and suggesting that a log-normal distribution for $E$ as a function of $n'$ might provide a better fit to the data than the normal distribution derived in (4). It is also interesting to note that the distributions for the 55- and 100-atom Lennard-Jones clusters look quite similar, although a magic number icosahedral global minimum exists only for the smaller of these systems. A few of the high and low energy minima are shown in Fig. 2. The high energy structures are rather open when compared to the relatively densely packed global minima, as expected.

4. Conclusions

The above analysis basically rests upon the observation that symmetry, or approximate symmetry, will cause bunching in the distribution of the terms which sum to give the total energy. Even if the average energy is unaffected by symmetry, the variance is expected to be larger for sets of minima with higher symmetry, which we must now define in terms of a measure of the bunching. We therefore expect to find minima with higher exact or approximate symmetry at both the top and the bottom of the distribution for systems with a given composition. Low energy structures are therefore more likely to have higher effective symmetry and lie in a tail of the probability distribution. In this sense a ‘principle of maximal symmetry’ can operate even when the converse statement, that high symmetry structures have low energy, does not hold. Although this result may not seem profound, it does not appear to have been derived in this way before.

The observation that conformers with low energy and high approximate symmetry lie in a tail of the distribution may help to explain the finding of Yue and Dill that lattice model proteins exhibit relatively few such minima [20]. Li et al. have shown that structural regularities in the latter model are also associated with high ‘designability’ [28], where many distinct sequences share a common lowest energy structure in terms of the occupied lattice sites. Wolynes found that designability is governed by the sequence averaged free energy function which is minimized for the most likely structures [17] and is directly related to the variance of the potential energy, consistent with the present analysis. It therefore appears that structures with higher approximate symmetry are likely to exhibit extremes of designability as well as potential energy. Analysis of a simple model polypeptide by Nelson et al. [29] also suggests a link between minimally frustrated structures, ‘symmetry’ of the global minimum and fragility to sequence mutations. Hence there is a growing body of evidence that all the above issues are interlinked. A practical definition of approximate symmetry would appear to be a prerequisite for further progress and the present approach may be useful in this regard.

Fig. 1. Plots of energy (horizontal axis) against $n'$ (vertical axis) for samples of local minima in several model systems. The global minimum is indicated by an arrow in each case. Parts (a)–(d) are for a 13-atom cluster bound by the Morse potential [21] with the range parameter [22] set to $r_0 = 4$, 6, 10 and 14, respectively. For these clusters almost exhaustive samples of minima containing 161, 1467, 9495 and 13182 minima, respectively, have been compiled. Parts (e) and (f) are for samples of 39809 and 81534 minima for clusters of 55 and 100 atoms bound by the Lennard-Jones potential [23]. Part (g) is for a sample of 40357 minima for the three-colour, 46-bead polypeptide model [24]. A couple of extremely high energy local minima have been omitted from this plot. Part (h) is for a sample of 81534 minima for Ag$_{80}$ with the appropriate Sutton–Chen potential [25]. Part (i) is for a sample of 45971 minima for (H$_2$O)$_{20}$ with the TIP4P potential [26]. The resolutions used to produce these figures were (a) 0.001, (b) 0.004, (c) 0.004, (d) 0.004, (e) 0.004, (f) 0.004, (g) 0.00001 eV, (h) 0.0254 eV and (i) 4 kJ/mol. The pair well depth is the unit of energy for the models in (a)–(f).

Fig. 2. The lowest (left) and two highest energy minima for 55- and 100-atom Lennard-Jones clusters (top and middle rows) and a model Ag$_{80}$ cluster (bottom row).

Acknowledgements

The author gratefully acknowledges financial support from the Royal Society of London and thanks Dr. Anthony Stone, Dr. Jon Doye and Mr. Mark Miller for their comments on the original manuscript.

References

[1] A.F. Wells, The third dimension in chemistry, Cambridge University Press, Cambridge, 1942.

[2] H.S.M. Coxeter, Regular polytopes, Macmillan, New York, 1963.

[3] R.B. Fuller, E.J. Applewhite, Synergetics: explorations in the geometry of thinking, St. Martins, New York, 1975.

[4] R. Williams, The geometrical foundation of natural structure, Dover, Toronto, 1979.

[5] I. Hargittai, M. Hargittai, Symmetry through the eyes of a chemist, second edition, Plenum Press, New York, 1995.

[6] P. Curie, Oeuvres de Pierre-Curie, Gauthiers-Villars, Paris, 1908, p. 118.

[7] D’Arcy W. Thompson, On growth and form, Cambridge University Press abridged edition, Cambridge, 1961, p. 327.

[8] A. Ceulemans, D. Beyens, L.G. Vanquickenborne, J. Am. Chem. Soc. 106 (1984) 5824.

[9] A. Ceulemans, L.G. Vanquickenborne, Structure and Bonding 71 (1989) 125.

[10] J. Leech, Math. Gazette 41 (1957) 81.

[11] D.J. Wales, J. Am. Chem. Soc. 112 (1990) 7908.

[12] S.-J. Chen, K.A. Dill, J. Chem. Phys. 104 (1996) 5964.

[13] P.-A. Lindgård, H. Bohr, Phys. Rev. Lett. 77 (1996) 779.

[14] M.E. Kellman, J. Chem. Phys. 105 (1996) 2500.

[15] K. Yue, K. Dill, Proc. Natl. Acad. Sci. USA 92 (1995) 146.

[16] F. Crick, J.D. Watson, Nature 177 (1956) 473.

[17] P.G. Wolynes, Proc. Natl. Acad. Sci. USA 93 (1996) 14249.

[18] P.W. Fowler, C.M. Quinn, Theor. Chim. Acta 70 (1986) 333.

[19] H. Zabrodsky, S. Peleg, D. Avnir, J. Am. Chem. Soc. 114 (1992) 7843.

[20] K. Yue, K.A. Dill, Proc. Natl. Acad. Sci. USA 92 (1995) 146.

[21] P.M. Morse, Phys. Rev. 34 (1929) 57.

[22] P.A. Braier, R.S. Berry, D.J. Wales, J. Chem. Phys. 93 (1990) 8745.

[23] J.E. Jones, A.E. Ingham, Proc. Roy. Soc. A 107 (1925) 636.

[24] R.S. Berry, N. Elmaci, J.P. Rose, B. Vekhter, Proc. Natl. Acad. Sci. USA 00 (1997) 0000.

[25] A.P. Sutton, J. Chen, Phil. Mag. Lett. 61 (1990) 139.

[26] W.L. Jorgensen, J. Chandrasekhar, J. Madura, J. Chem. Phys. 79 (1983) 926.

[27] J.D. Honeycutt, D. Thirumalai, Proc. Natl. Acad. Sci. USA 87 (1990) 3526.

[28] H. Li, R. Helling, C. Tang, N. Wingreen, Science 273 (1996) 666.

[29] E.D. Nelson, L.F. Teneyck, J.N. Onuchic, Phys. Rev. Lett. 79 (1997) 3534.

Symmetry, near-symmetry and energetics

David J. Wales

UniÕersity Chemical Laboratories, Lensfield Road, Cambridge CB2 1EW, UK Received 12 December 1997; in final form 13 January 1998

Abstract

Is there a connection between symmetry and energetics? Surveys of over 300000 local minima from nine different model systems reveal some correlation between the degree of symmetry and the mean energy. However, both theory and calculation suggest that the variance may be more important. Systems with higher symmetry, defined in terms of the spectrum of contributions to the total energy, are more likely to exhibit particularly low and high energy minima. This change in viewpoint may be sufficient to account for the widespread appearance of high symmetry. q 1998 Elsevier Science B.V.

1. Introduction

The aesthetics and mathematics of symmetrical structures have intrigued humankind across the ages, from Pythagoras, Euclid and Plato to da Vinci and Buckminster Fuller w1–5x. The appearance of high symmetry has often been commented upon. Pierre Curie postulated that ‘the symmetry characteristic of a phenomenon is the maximal symmetry compatible with the existence of the phenomenon’ w6x. D’Arcy Thompson wrote that ‘the perfection of mathematical beauty is such . . . that whatsoever is most beautiful and regular is also found to be most useful and excellent’ w7x. However, it is not difficult to construct counterexamples to show that systems do not necessarily exhibit their highest possible symmetry.

The most convincing example of a maximum symmetry result is perhaps the ‘epikernel principle’ of Ceulemans et al., which proposes that when the Jahn–Teller effect operates the resulting symmetry breaking actually maintains as much symmetry as possible w8,9x. One can also prove that so-called

‘balanced structures’ w10x, where all the atoms lie on a rotation axis, are in tangential equilibrium. This result generally means that there is a stationary point of the balanced shape for some particular system size, but the stationary point need not be a minimum w11x.

Recently several authors have referred to ‘symmetries’ exhibited by relatively large biomolecules which cannot be described within the framework of point groups. Chen and Dill have developed an approach based upon knot theory to define and clas-

sify symmetries in compact polymers w12x. Lindg˚ard

and Bohr have found magic numbers for secondary structure in proteins by projecting consecutive secondary structures onto a lattice w13x. Kellman has commented that ‘many globular proteins fold to native states with great regularity in their tertiary structure, reminiscent of symmetry’ w14x. His analysis reveals permutation symmetries in model proteins when the links between the residues on a lattice are ignored. Yue and Dill have suggested that the ubiquity of symmetry arises because symmetric patterns with nondegenerate ground states are easy to design

0009-2614r98r$19.00 q 1998 Elsevier Science B.V. All rights reserved. PII S 0 0 0 9 - 2 6 1 4 Ž 9 8 . 0 0 0 4 4 - X

D.J. Walesr Chemical Physics Letters 285 (1998) 330–336

331

w15x. Kellman’s study supports their view that there is a connection between symmetry and low degeneracy, where in this context degeneracy refers to the number of structures with a similar energy.

Clearly the manifestation of symmetry in biological systems through the duplication of components from the same information can provide an evolutionary advantage. This was the argument used by Watson and Crick in predicting that virus capsids would be made of repeating subunits w16x. There are many other examples of oligomeric biomolecules, including enzymes such as hemoglobin, where the repeated units are in symmetry inequivalent environments and so the overall point group is clearly C1, despite the intuitively symmetric appearance of the molecule. However, Wolynes has recently deduced an approximate connection between the designability of a given heteropolymer structure and the symmetries of a transformed free energy function w17x.

The common association of high symmetry with low energy in the literature and the new results appearing for large biomolecules have been touched upon above. Furthermore, to include the observations for biomolecules we must extend our notion of symmetry beyond point groups. The present contribution addresses both issues. A qualitative argument suggests that high symmetry structures are more likely to have either particularly low or particularly high energies, even when the average energy depends only weakly upon symmetry. Extensive surveys of local minima for a number of model systems generally exhibit the predicted features. The resulting change of viewpoint may be sufficient to explain the prevalence of symmetry, including some insight into the effects of approximate symmetry.

2. The consequences of symmetry and near-symmetry

For any system we can write the potential energy as a series of terms representing the interactions between increasingly large sets of atoms:

E

s

e

Ž0.

q

Ý

e

Ž1. i

q

Ý

Ý

e

Ž2 ij

.

i

j-i i

qÝ

Ý

Ý

e

Ž3. i jk

q

.

.

.

,

Ž1.

k-j j-i i

where, for example, eiŽj2. represents the two-body energy contribution from atoms i and j. If the

molecule has N atoms then the last term in the series

represents the N-body interaction. The above form is

generally applicable, except where quantum mechan-

ical states approach degeneracy, in which case we

must instead write separate series for every term in a

matrix whose eigenvalues then give the allowed

energy levels. We will therefore proceed on the basis that the total energy is a sum of contributions Äet4 resulting from one-body, two-body, etc., interactions. Here Äet4 is the set containing all the various energy contributions eiŽjm. ... in Ž1., ranked in ascending order: e1Fe2Fe3 . . . .

Usually symmetry tells us when integrals repre-

senting quantum mechanical expectation values van-

ish, enabling us to derive spectroscopic selection

rules, for example. However, it is important to re-

member that symmetry can also tell us when parame-

ters such as bond lengths and angles must be the

same. The total energy is invariant under any point

group symmetry operation, Qˆ. Qˆ must permute the

energy contributions Äet4 in such a way that E is unchanged. Hence the Äet4 can be separated into sets which are permuted only amongst themselves and

therefore form representations of the appropriate point group, G. Each set of Äet4 which is mapped

into itself under any operation Qˆg G and cannot be

further subdivided into smaller sets is called an orbit of the point group. For the energy contributions, Äet4, all the members of an orbit have the same numerical

value. The number of members in an orbit must be

equal to the order of a subgroup of G, ranging from one to hG , the order of G itself.

The ordered set of energy contributions, Äet4, may contain exact duplications whenever the system pos-

sesses geometrical symmetry elements. Hence we

may write the total energy as

M

n

E s Ý ei s Ý Ai eiX ,

Ž2.

is1

is1

where M is the total number of members in the set Äet4, which is fixed for a given molecular stoichiometry, and Ýnis1 Ai s M. ÄetX4 is then the ordered set of unique energy contributions, with duplicates re-

moved. The allowed values for the Ai correspond to

332

D.J. Walesr Chemical Physics Letters 285 (1998) 330–336

orbits of the prevailing point group. For example, if

the molecule has point group Ih then the allowed values of Ai are w18x 1, 12, 20, 30, 60 and 120.

In the present analysis we extend the notion of

symmetry to include cases where there are close, but

not exact, degeneracies in the set Äet4. Here we must inevitably introduce a new parameter, r, which cor-

responds to the resolution with which we view the Äet4 distribution. The ÄetX4 set is now constructed by replacing any sequence ej, ejq1, . . . , ejqm, where ejq1 y ej - r, ejq2 y ejq1 - r, . . . , ejqm y ejqmy1 r, with the mean value ejX s Ýiissjjqmeirm. If ejX appears with weight AXj s m then the expression for the total energy is still exact, independent of the resolu-

tion:

nX

E s Ý AXi eiX .

Ž3.

is1

For example, in a purely oligomeric molecule with

point group C1 we would expect many AXi to be

characteristic of the point group appropriate for the

repeated unit, rather than unity, if the resolution is

chosen appropriately. Hence, at the expense of one

new parameter, r, we can incorporate both exact and

near symmetry. In this framework the amount of

symmetry may be quantified either by the mean

value of AXi, ² AX:, Žlarge for high symmetry. or the

number of terms in the sum, nX Žsmall for high

symmetry.. These measures are equivalent because

the total number of energy contributions is fixed for

a

given

system

at

M,

so

that

M

s

nX

Ýis

1

AXi

s

nX²

AX

:.

We note that Avnir and coworkers have also devel-

oped a continuous measure of symmetry w19x, and it

may be possible to connect the two approaches in

future work.

To investigate the relation between symmetry and

energy we must consider the probability distribution

for E as a function of nX. It is only possible to draw

qualitative conclusions because E is the sum of nX

distinct AXi eiX terms, which are not really independent. Furthermore, both AXi and eiX depend upon the

geometry Žand the resolution, r .. To make contact

with thermodynamics we note that the potential en-

ergy makes the most important contribution to the

free energy at low temperature in the canonical

ensemble. The present data would allow approximate

calculations of free energies or entropies, but we will

concentrate on the potential energy here to avoid further approximations.

The mean and variance of the sum of $n'$ variables, $x_i$, drawn at random from the same distribution, $P(x)$, are simply $n'\mu$ and $n'\sigma^2$, where $\mu$ and $\sigma^2$ are the mean and variance of $P(x)$. Eq. (3) shows that the energy can be written as $n'$ terms of the form $A'_i e'_i$. To a first approximation it is reasonable to assume that the mean value of these terms will scale as $1/n'$. This scaling is necessary to give an average total energy that does not vary strongly with $n'$, as found in the numerical examples below. The variance of the $A'_i e'_i$ terms is then expected to scale as $1/(n')^2$, which suggests that the variance of the energy should scale as $1/n'$. In other words, even if the expected energy of a given minimum does not depend strongly on symmetry, the variance will. Hence it is not necessary to associate low energy with high symmetry to explain the prevalence of high symmetry. Due to the larger variance we expect to find both significantly higher and lower energy minima amongst structures with higher symmetry measures, as judged by the value of $n'$. This observation is sufficient to account for the predominance of high symmetry amongst the lowest energy isomers. We also expect the lowest minima to occur in a tail of the probability distribution, whose width increases with increasing symmetry, according to the measure $n'$. These results may help to explain the observation of Yue and Dill [20] that lattice model proteins exhibit relatively few conformers with low energy and high ‘symmetry’.
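The scaling argument can be checked numerically with a small Monte Carlo sketch. Everything specific here is an assumption made for illustration: the terms are drawn independently from a normal distribution, and the names `sample_total_energies`, `e_total` and `rel_spread` are hypothetical. Each term has mean `e_total / n_prime` and a standard deviation proportional to `1 / n_prime`, so the variance of the total should fall roughly as $1/n'$.

```python
import random
import statistics


def sample_total_energies(n_prime, e_total=-100.0, rel_spread=0.3,
                          n_samples=5000, seed=0):
    """Draw totals E, each the sum of n_prime independent normal terms.

    Assumed model: term mean = e_total / n_prime and term standard
    deviation = rel_spread * |e_total| / n_prime, so that the term
    variance scales as 1 / n_prime**2.
    """
    rng = random.Random(seed)
    mu = e_total / n_prime
    sigma = rel_spread * abs(e_total) / n_prime
    return [sum(rng.gauss(mu, sigma) for _ in range(n_prime))
            for _ in range(n_samples)]


results = {}
for n_prime in (10, 40):
    totals = sample_total_energies(n_prime)
    # Store (sample mean, sample variance) of the total energy for this n'.
    results[n_prime] = (statistics.fmean(totals), statistics.variance(totals))
```

Under these assumptions the sample means for the two values of `n_prime` should both lie close to `e_total`, while the sample variance for `n_prime = 10` should be roughly four times that for `n_prime = 40`. Real contributions are correlated, so this is only a picture of the idealised argument.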

A more quantitative result can be obtained by applying the central limit theorem to obtain the distribution of the energy as a function of $n'$, which assumes that $n'$ is large and that the contributions to $E$ are uncorrelated. The first assumption is reasonable for the systems of interest here, but the second assumption is only an approximation. The resulting probability distribution is:

$$P(E, n') = \frac{1}{\sqrt{2\pi n' \langle (A'e')^2 \rangle_{n'}}} \exp\left[ -\frac{\left( E - n' \langle A'e' \rangle_{n'} \right)^2}{2 n' \langle (A'e')^2 \rangle_{n'}} \right], \qquad (4)$$


where the averages denoted by angle brackets are for fixed $n'$. The mean of the distribution is therefore $n' \langle A'e' \rangle_{n'}$ and the variance is $n' \langle (A'e')^2 \rangle_{n'}$. As above, we might expect the mean to change only weakly with $n'$ and the variance to scale roughly as $1/n'$.
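A short sketch may help to show how the normal form of Eq. (4) turns a larger variance into a heavier low energy tail. The function names, the energy scale `E0`, the spread constant `c` and the cutoff are all hypothetical choices. The argument `var_term` stands for the per-term quantity that multiplies $n'$ in the variance of Eq. (4).

```python
import math


def energy_pdf(E, n_prime, mean_term, var_term):
    """Normal density with mean n' * mean_term and variance n' * var_term."""
    mean = n_prime * mean_term
    var = n_prime * var_term
    return math.exp(-(E - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def low_energy_tail(E_cut, n_prime, mean_term, var_term):
    """Probability that E falls below E_cut under the same normal law."""
    mean = n_prime * mean_term
    sd = math.sqrt(n_prime * var_term)
    return 0.5 * math.erfc((mean - E_cut) / (math.sqrt(2.0) * sd))


E0, c = -100.0, 900.0                 # assumed energy scale and spread constant
tails = {}
for n_prime in (10, 40):
    mean_term = E0 / n_prime          # per-term mean scaling as 1/n'
    var_term = c / n_prime ** 2       # per-term variance scaling as 1/n'^2
    tails[n_prime] = low_energy_tail(-115.0, n_prime, mean_term, var_term)
```

With these made-up parameters both distributions share the same mean, but the probability of falling below the cutoff is much larger for the smaller value of `n_prime`. This is the sense in which low energy minima are expected to lie in a tail that widens with increasing symmetry.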

For a system which possesses symmetry in terms of weights, $A'_i$, that are greater than unity, the search for local minima with fixed values of the $A'_i$ is a constrained optimization problem. There are bound to be fewer solutions to this problem than when all the atoms are allowed to move independently. Hence we expect the number of local minima on the potential energy surface to decrease with increasing symmetry, where again we measure symmetry in terms of $\langle A' \rangle$ or $n'$.

3. Numerical results

Nine different model systems were considered. The smallest is a 13-atom cluster bound by the Morse potential. In reduced units there is one free parameter, $\rho_0$, left which corresponds to the range of the interaction [22]. The results are for almost complete sets of minima at four different values of $\rho_0$. The five other systems all have astronomically large numbers of local minima, for which representative samples were collected. We present results for 55- and 100-atom clusters bound by the Lennard-Jones potential, the ‘three-colour, 46-bead’ model polypeptide adapted from the work of Thirumalai [27] by Berry et al. [24], the cluster Ag$_{80}$ modelled by a many-body Sutton–Chen potential [25] and the (H$_2$O)$_{20}$ cluster modelled by the rigid body TIP4P potential [26]. The model polypeptide potential contains three- and four-body forces via the bond-angle and dihedral-angle terms, and the Sutton–Chen potential includes pairwise and $N$-body terms. The remaining potentials are all pairwise additive.

The price for extending the analysis to incorporate the continuous symmetry measure is the resolution parameter, $r$. If $r$ is too small then only exact degeneracies in the $\{e_t\}$ set will be detected, while if $r$ is too large then unrelated $e_i$ will be lumped together. Neither of these limits is appropriate, and so we expect to find an optimal resolution for each system where the distribution of the local minima energies as a function of $n'$ best agrees with the theory. For each of the above models we have analyzed the $\{e_t\}$ over resolutions varying by at least six orders of magnitude and we have also attempted to fit the distributions to various functional forms. Here we will avoid further numerical presentations by simply showing plots of the local minima energies $E$ against $n'$ for the resolution which appears to give the best agreement with the above theory (Fig. 1).
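The two limits of the resolution can be seen in a toy scan. This is a minimal sketch with made-up contributions and a hypothetical helper `count_groups`; it is not the fitting procedure used for the figures.

```python
def count_groups(terms, r):
    """Number of groups n' when neighbouring sorted terms closer than r are merged."""
    ordered = sorted(terms)
    n_prime = 1 if ordered else 0
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev >= r:           # gap at or above the resolution: new group
            n_prime += 1
    return n_prime


# Made-up contributions: three near-degenerate bunches plus one isolated term.
terms = [-1.000, -0.9995, -0.999, -0.600, -0.5998, -0.300, -0.2999, -0.050]

# Scan resolutions spanning six orders of magnitude.
scan = {r: count_groups(terms, r) for r in (1e-6, 3e-4, 1e-3, 0.35, 1.0)}
```

At the smallest resolution every term is distinct, so `n_prime` equals the number of contributions. At the largest everything is lumped into a single group. The intermediate values are the ones that pick out the bunches.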

In general the predicted trends are quite well borne out in Fig. 1, with the widths of the distributions increasing as $n'$ decreases. In judging the quality of this agreement it is important to note that the population of minima at high $n'$ vastly exceeds that at small $n'$, and that this effect is exacerbated by the difficulty in sampling the sparsely populated high energy regions at small values of $n'$. Hence it is not surprising that there are usually a few particularly high and low energy minima which do not possess high symmetry measures. Symmetry constrained sampling was used for all the atomic clusters to find the more elusive high symmetry, high energy minima. In every case the structures were tightly converged and their normal modes calculated to verify that they were true minima rather than saddle points. Most of the distributions appear to be asymmetric, exhibiting longer tails at high energy and suggesting that a log-normal distribution for $E$ as a function of $n'$ might provide a better fit to the data than the normal distribution derived in (4). It is also interesting to note that the distributions for the 55- and 100-atom Lennard-Jones clusters look quite similar, although a magic number icosahedral global minimum exists only for the smaller of these systems. A few of the high and low energy minima are shown in Fig. 2. The high energy structures are rather open when compared to the relatively densely packed global minima, as expected.

4. Conclusions

The above analysis basically rests upon the observation that symmetry, or approximate symmetry, will cause bunching in the distribution of the terms which sum to give the total energy. Even if the average energy is unaffected by symmetry, the variance is expected to be larger for sets of minima with higher symmetry, which we must now define in terms of a measure of the bunching. We therefore expect to find minima with higher exact or approximate symmetry at both the top and the bottom of the distribution for systems with a given composition. Low energy structures are therefore more likely to have higher effective symmetry and lie in a tail of the probability distribution. In this sense a ‘principle of maximal symmetry’ can operate even when the converse statement, that high symmetry structures have low energy, does not hold. Although this result may not seem profound, it does not appear to have been derived in this way before.

The observation that conformers with low energy and high approximate symmetry lie in a tail of the distribution may help to explain the finding of Yue and Dill that lattice model proteins exhibit relatively few such minima [20]. Li et al. have shown that structural regularities in the latter model are also associated with high ‘designability’ [28], where many distinct sequences share a common lowest energy structure in terms of the occupied lattice sites. Wolynes found that designability is governed by the sequence averaged free energy function which is minimized for the most likely structures [17] and is directly related to the variance of the potential energy, consistent with the present analysis. It therefore appears that structures with higher approximate symmetry are likely to exhibit extremes of designability as well as potential energy. Analysis of a simple model polypeptide by Nelson et al. [29] also suggests a link between minimally frustrated structures, ‘symmetry’ of the global minimum and fragility to sequence mutations. Hence there is a growing body of evidence that all the above issues are interlinked. A practical definition of approximate symmetry would appear to be a prerequisite for further progress and the present approach may be useful in this regard.

Fig. 1. Plots of energy (horizontal axis) against $n'$ (vertical axis) for samples of local minima in several model systems. The global minimum is indicated by an arrow in each case. Parts (a)–(d) are for a 13-atom cluster bound by the Morse potential [21] with the range parameter [22] set to $\rho_0 = 4$, 6, 10 and 14, respectively. For these clusters almost exhaustive samples of minima containing 161, 1467, 9495 and 13182 minima, respectively, have been compiled. Parts (e) and (f) are for samples of 39809 and 81534 minima for clusters of 55 and 100 atoms bound by the Lennard-Jones potential [23]. Part (g) is for a sample of 40357 minima for the three-colour, 46-bead polypeptide model [24]. A couple of extremely high energy local minima have been omitted from this plot. Part (h) is for a sample of 81534 minima for Ag$_{80}$ with the appropriate Sutton–Chen potential [25]. Part (i) is for a sample of 45971 minima for (H$_2$O)$_{20}$ with the TIP4P potential [26]. The resolutions used to produce these figures were (a) 0.001, (b) 0.004, (c) 0.004, (d) 0.004, (e) 0.004, (f) 0.004, (g) 0.00001 eV, (h) 0.0254 eV and (i) 4 kJ/mol. The pair well depth is the unit of energy for the models in (a)–(f).

Fig. 2. The lowest (left) and two highest energy minima for 55- and 100-atom Lennard-Jones clusters (top and middle rows) and a model Ag$_{80}$ cluster (bottom row).

Acknowledgements

The author gratefully acknowledges financial support from the Royal Society of London and thanks Dr. Anthony Stone, Dr. Jon Doye and Mr. Mark Miller for their comments on the original manuscript.

References

[1] A.F. Wells, The third dimension in chemistry, Cambridge University Press, Cambridge, 1942.

[2] H.S.M. Coxeter, Regular polytopes, Macmillan, New York, 1963.

[3] R.B. Fuller, E.J. Applewhite, Synergetics: explorations in the geometry of thinking, St. Martins, New York, 1975.

[4] R. Williams, The geometrical foundation of natural structure, Dover, Toronto, 1979.

[5] I. Hargittai, M. Hargittai, Symmetry through the eyes of a chemist, second edition, Plenum Press, New York, 1995.

[6] P. Curie, Oeuvres de Pierre-Curie, Gauthier-Villars, Paris, 1908, p. 118.

[7] D’Arcy W. Thompson, On growth and form, Cambridge University Press abridged edition, Cambridge, 1961, p. 327.

[8] A. Ceulemans, D. Beyens, L.G. Vanquickenborne, J. Am. Chem. Soc. 106 (1984) 5824.

[9] A. Ceulemans, L.G. Vanquickenborne, Structure and Bonding 71 (1989) 125.

[10] J. Leech, Math. Gazette 41 (1957) 81.

[11] D.J. Wales, J. Am. Chem. Soc. 112 (1990) 7908.

[12] S.-J. Chen, K.A. Dill, J. Chem. Phys. 104 (1996) 5964.

[13] P.-A. Lindgård, H. Bohr, Phys. Rev. Lett. 77 (1996) 779.

[14] M.E. Kellman, J. Chem. Phys. 105 (1996) 2500.

[15] K. Yue, K. Dill, Proc. Natl. Acad. Sci. USA 92 (1995) 146.

[16] F. Crick, J.D. Watson, Nature 177 (1956) 473.

[17] P.G. Wolynes, Proc. Natl. Acad. Sci. USA 93 (1996) 14249.

[18] P.W. Fowler, C.M. Quinn, Theor. Chim. Acta 70 (1986) 333.

[19] H. Zabrodsky, S. Peleg, D. Avnir, J. Am. Chem. Soc. 114 (1992) 7843.

[20] K. Yue, K.A. Dill, Proc. Natl. Acad. Sci. USA 92 (1995) 146.

[21] P.M. Morse, Phys. Rev. 34 (1929) 57.

[22] P.A. Braier, R.S. Berry, D.J. Wales, J. Chem. Phys. 93 (1990) 8745.

[23] J.E. Jones, A.E. Ingham, Proc. Roy. Soc. A 107 (1925) 636.

[24] R.S. Berry, N. Elmaci, J.P. Rose, B. Vekhter, Proc. Natl. Acad. Sci. USA 00 (1997) 0000.

[25] A.P. Sutton, J. Chen, Phil. Mag. Lett. 61 (1990) 139.

[26] W.L. Jorgensen, J. Chandrasekhar, J. Madura, J. Chem. Phys. 79 (1983) 926.

[27] J.D. Honeycutt, D. Thirumalai, Proc. Natl. Acad. Sci. USA 87 (1990) 3526.

[28] H. Li, R. Helling, C. Tang, N. Wingreen, Science 273 (1996) 666.

[29] E.D. Nelson, L.F. Teneyck, J.N. Onuchic, Phys. Rev. Lett. 79 (1997) 3534.