Basis Sets Used in Molecular Orbital Calculations

Split-valence basis sets I: the 3-21G basis set

The description of the valence electrons can be improved significantly over that in the minimal STO-3G basis set if more than one basis function is used per valence orbital. Basis sets of this type are called "split valence" basis sets, as the description of each valence orbital is split into two (or more) basis functions. A related term is "double zeta", in reference to the Greek letter zeta (ζ) used for the orbital exponents of STOs. The term "double zeta" (DZ) does not specify, however, whether two basis functions are used for all of the orbitals or only for the valence space.

One very economical, small split valence basis set is the 3-21G basis set.^2-4 Each core (non-valence) orbital is described by a single basis function composed of a contraction of three Gaussians. Each valence orbital is described by two basis functions. The first of these is a contraction of two Gaussian primitives, while the second consists of a single uncontracted Gaussian primitive.
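What a "contraction of three Gaussians" means numerically can be illustrated with a minimal sketch. It assumes that the contraction coefficients in a Gaussian-format listing refer to normalized primitives, and it uses the standard overlap formula for two normalized Gaussians of the same angular momentum on one centre. The function name contraction_self_overlap is purely illustrative. The numbers are the carbon 1s values of the 3-21G basis set given in this section.

```python
import math


def contraction_self_overlap(exponents, coefficients, l=0):
    """Self-overlap <chi|chi> of a contraction of normalized primitives.

    Assumes all primitives sit on the same centre and share the angular
    momentum l (0 for s, 1 for p). For two normalized primitives with
    exponents a and b the overlap is (2*sqrt(a*b)/(a+b))**(l + 3/2).
    """
    total = 0.0
    for a, ca in zip(exponents, coefficients):
        for b, cb in zip(exponents, coefficients):
            total += ca * cb * (2.0 * math.sqrt(a * b) / (a + b)) ** (l + 1.5)
    return total


# Carbon 1s core function of 3-21G: three primitives, one contraction.
core_exponents = [172.256, 25.9109, 5.53335]
core_coefficients = [0.0617669, 0.358794, 0.700713]

core_norm = contraction_self_overlap(core_exponents, core_coefficients)
print(f"<1s|1s> for the contracted carbon core function: {core_norm:.4f}")
```

Under the stated assumption the printed value should be very close to 1, which indicates that the three primitives together form one normalized basis function with a single variable MO coefficient.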

For carbon the 3-21G basis set is² (Gaussian 94 format):

C 0
S 3 1.00
.1722560000D+03 .6176690000D-01
.2591090000D+02 .3587940000D+00
.5533350000D+01 .7007130000D+00
SP 2 1.00
.3664980000D+01 -.3958970000D+00 .2364600000D+00
.7705450000D+00 .1215840000D+01 .8606190000D+00
SP 1 1.00
.1958570000D+00 .1000000000D+01 .1000000000D+01

A listing of this kind can be printed by Gaussian with the following input file:

%Kjob L301

#P HF/3-21G GFInput GFPrint

methanol basis set

0,1

C1
O2  1  r2
H3  1  r3  2  a3
H4  1  r4  2  a4  3  d4

r2=1.20
r3=1.0
r4=1.0
a3=120.
a4=120.
d4=180.

The %Kjob L301 command stops the job once the input has been checked and the basis set has been set up (link 301), so that no actual calculation is performed.

The GFInput (“Gaussian Function Input”) output generation keyword causes the current basis set to be printed in a form suitable for use as general basis set input, and can thus be used in adding to or modifying standard basis sets.

The GFPrint output generation keyword prints the current basis set in tabular form.

The main difference from the STO-3G listing shown before is that there are now two sections for the 2s and 2p orbitals. Again, s and p orbitals share the same exponents a_2,x, listed in the first column, but have different contraction coefficients, listed in the second and third columns. The contraction coefficients of the outer SP shell are, of course, unity, as the outer SP shell consists of only one Gaussian function. Comparing the orbital exponents of the first SP shell (3.66 and 0.77) with that of the second SP shell (0.20) shows that the first SP shell has larger exponents and therefore describes electron density closer to the nucleus than the second SP shell. Consequently, the first and second SP shells are sometimes referred to as the "inner" and "outer" shells, respectively.
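The structure of the listing and the inner/outer distinction can be made concrete with a short sketch. It reads the carbon 3-21G listing from above, converts the Fortran-style D exponents to Python floats, and reports for each primitive the radius 1/sqrt(exponent) at which exp(-exponent * r^2) has dropped to 1/e. The helper names (to_float, parse_shells) are illustrative, the parser only handles a single-atom block of this simple form, and the radii are in bohr under the assumption that the exponents are given in atomic units.

```python
import math

LISTING = """\
C 0
S 3 1.00
.1722560000D+03 .6176690000D-01
.2591090000D+02 .3587940000D+00
.5533350000D+01 .7007130000D+00
SP 2 1.00
.3664980000D+01 -.3958970000D+00 .2364600000D+00
.7705450000D+00 .1215840000D+01 .8606190000D+00
SP 1 1.00
.1958570000D+00 .1000000000D+01 .1000000000D+01
"""


def to_float(token):
    # Fortran double-precision notation uses D instead of E.
    return float(token.replace("D", "E"))


def parse_shells(text):
    """Return a list of (shell_type, rows); each row is [exponent, coeff, ...]."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    shells = []
    i = 1  # skip the "C 0" header line
    while i < len(lines):
        shell_type, n_prim = lines[i][0], int(lines[i][1])
        rows = [[to_float(tok) for tok in lines[i + 1 + k]] for k in range(n_prim)]
        shells.append((shell_type, rows))
        i += 1 + n_prim
    return shells


shells = parse_shells(LISTING)

# 1/e radius of each primitive: exp(-alpha * r**2) = 1/e at r = 1/sqrt(alpha).
widths = [[1.0 / math.sqrt(row[0]) for row in rows] for _, rows in shells]

for (shell_type, rows), shell_widths in zip(shells, widths):
    for row, width in zip(rows, shell_widths):
        print(f"{shell_type:>2}  exponent {row[0]:10.4f}  1/e radius {width:6.3f} bohr")
```

The single primitive of the last SP shell has the smallest exponent and hence the largest radius, which is the sense in which this shell is the "outer" one.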

The 3-21G basis set has a somewhat curious developmental history. Straightforward variation of the orbital exponents and expansion coefficients, as practiced in the development of other basis sets, led to a "falling inward" of the inner valence shells. This happens because only three Gaussians are available for the description of the core region, and adding more primitives to the core region lowers the overall energy more than an adequate description of the valence shell does. For most elements the basis set parameters were therefore first optimized using a much larger core space of six Gaussians (6-21G). At this stage all basis set parameters were varied to minimize the energy of the atoms in their electronic ground states at the unrestricted Hartree-Fock (UHF) level of theory. The large core was then replaced by a smaller one of only three Gaussians, and the basis set parameters of the new core region were reoptimized, keeping the valence region constant. A similar strategy has also been used for second-row elements.^3,4

How much more computational effort is required for the 3-21G basis set as compared to the minimal STO-3G solution? Each carbon and oxygen atom in methanol now needs 9 and each hydrogen atom needs 2 basis functions. For methanol this equates to 26 instead of 14 basis functions. This much larger number of basis functions and thus variable MO coefficients is achieved, however, with the same number of Gaussian primitives as for the STO-3G basis set: 42.
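The counts quoted above can be reproduced with a small bookkeeping sketch. The shell layouts are written out by hand from the descriptions in this section; the dictionary layout and the function name count_methanol are illustrative choices, and primitives are counted once per basis function component (so an SP shell of two Gaussians contributes 4 x 2 primitives), which is the convention implied by the totals in the text.

```python
# Number of basis functions contributed by one shell of each type.
COMPONENTS = {"S": 1, "P": 3, "SP": 4}

# Shells per atom as (shell type, number of primitives in the contraction).
# "heavy" stands for carbon or oxygen.
BASIS_SETS = {
    "STO-3G": {
        "heavy": [("S", 3), ("SP", 3)],
        "H": [("S", 3)],
    },
    "3-21G": {
        "heavy": [("S", 3), ("SP", 2), ("SP", 1)],
        "H": [("S", 2), ("S", 1)],
    },
}

# Methanol, CH3OH: two heavy atoms and four hydrogens.
METHANOL = {"heavy": 2, "H": 4}


def count_methanol(basis_name):
    """Return (basis functions, Gaussian primitives) for methanol."""
    n_functions = 0
    n_primitives = 0
    for kind, n_atoms in METHANOL.items():
        for shell_type, n_prim in BASIS_SETS[basis_name][kind]:
            n_functions += n_atoms * COMPONENTS[shell_type]
            n_primitives += n_atoms * COMPONENTS[shell_type] * n_prim
    return n_functions, n_primitives


sto3g_counts = count_methanol("STO-3G")
split_counts = count_methanol("3-21G")
print("STO-3G:", sto3g_counts)
print("3-21G: ", split_counts)
```

The number of MO coefficients grows with the first entry of each pair, while the second entry stays the same for both basis sets.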

Split-valence basis sets II: the 6-31G basis sets

Development of the larger 4-31G and 6-31G split valence basis sets^5,6 predates that of the 3-21G basis set considerably. The main difference between 3-21G and 6-31G is that the latter uses a much larger number of primitives in the core as well as in the innermost valence shell. The use of a contraction of six Gaussian primitives for each core orbital improves the description of the core region significantly. The valence region is again described by two basis functions per atomic orbital. The inner shell is composed of a contraction of three Gaussians, and the outer shell consists of a single Gaussian primitive. As in other basis sets developed by the Pople group, s and p functions share common exponents.

The 6-31G basis for carbon is:

C 0
S 6 1.00
.3047524880D+04 .1834737130D-02
.4573695180D+03 .1403732280D-01
.1039486850D+03 .6884262220D-01
.2921015530D+02 .2321844430D+00
.9286662960D+01 .4679413480D+00
.3163926960D+01 .3623119850D+00
SP 3 1.00
.7868272350D+01 -.1193324200D+00 .6899906660D-01
.1881288540D+01 -.1608541520D+00 .3164239610D+00
.5442492580D+00 .1143456440D+01 .7443082910D+00
SP 1 1.00
.1687144782D+00 .1000000000D+01 .1000000000D+01

The orbital exponents and expansion coefficients were optimized to yield the lowest possible UHF energies for the respective atoms in their electronic ground states. The exponents of the valence shell functions were then scaled uniformly, using scale factors developed for the 4-31G basis set, in order to achieve the best possible results in MO calculations on a set of small organic molecules.

For methanol, the 6-31G basis set includes 26 basis functions, which are composed of a total of 60 Gaussian primitives.

Double zeta basis sets: Dunning's D95 basis set

Dunning's D95 basis set has been derived from an existing large atomic basis set of nine uncontracted s-type and five uncontracted p-type Gaussian primitives.⁸ Six of the nine s-type functions have been grouped into a single contraction, while the other three s-type functions have been left uncontracted. Similarly, four of the five p-type functions have been contracted into a single function, while one function has been left uncontracted. Overall, this yields a basis set of four s-type and two p-type basis functions. In contrast to the split valence basis sets discussed before, the D95 basis set is a full double zeta basis set in that it allocates two basis functions to each atomic orbital occupied in the electronic ground state, in the core as well as in the valence region.

The D95 basis set for carbon is:

C 0
S 6 1.00
.4232610000D+04 .2029000000D-02   |
.6348820000D+03 .1553500000D-01   |
.1460970000D+03 .7541100000D-01   |
.4249740000D+02 .2571210000D+00   |
.1418920000D+02 .5965550000D+00   | -- s₁ + s₂ mainly describe
.1966600000D+01 .2425170000D+00   |    the 1s core of carbon
S 1 1.00                          |
.5147700000D+01 .1000000000D+01   |
S 1 1.00                          |
.4962000000D+00 .1000000000D+01   | -- s₃ + s₄ mainly describe
S 1 1.00                          |    the 2s valence orbital of carbon
.1533000000D+00 .1000000000D+01   |
P 4 1.00
.1815570000D+02 .1853400000D-01
.3986400000D+01 .1154420000D+00
.1142900000D+01 .3862060000D+00
.3594000000D+00 .6400890000D+00
P 1 1.00
.1146000000D+00 .1000000000D+01

The format used here for the listing of orbital exponents and expansion coefficients differs from that used before for the Pople basis sets, as s- and p-type functions of the D95 basis set do not share the same orbital exponents.

Using an uncontracted atomic basis set as the starting point for the development of contracted versions suitable for the treatment of larger systems is common practice. The standard nomenclature specifies the uncontracted basis set in parentheses and the resulting contracted version in square brackets. Using the D95 basis set as an example, the contraction can be described as (9s5p) -> [4s,2p]. This notation does not specify how many primitives are contained in each contraction. This can be specified in more detail as (6111,41), listing first the s-type functions (here distributed over four contractions) and then the p-type functions. Basis sets in which a given primitive appears in only one of the contractions are termed segmented.

Why are contractions done? MO calculations using the uncontracted 9s5p atomic basis set would need to handle 9*1+5*3=24 MO coefficients for each carbon atom while only 4*1+2*3=10 MO coefficients are necessary for the [4s,2p] contraction. If proper care is taken during the contraction process, calculations using the contracted basis sets can be performed with similar accuracy but dramatically reduced computational cost.
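The arithmetic behind these numbers follows directly from the detailed contraction notation, as the following sketch shows for the (6111,41) scheme of D95. The function names are illustrative, and the parser makes two simplifying assumptions: each digit stands for one contraction of fewer than ten primitives, and only s- and p-type groups occur, in that order.

```python
# Basis functions (and hence MO coefficients) per shell type.
COMPONENTS = {"s": 1, "p": 3}


def parse_scheme(scheme):
    """Turn a string such as '(6111,41)' into {'s': [6, 1, 1, 1], 'p': [4, 1]}.

    Assumes single-digit contraction lengths and the fixed order s, p.
    """
    groups = scheme.strip("()").split(",")
    return {label: [int(d) for d in group] for label, group in zip("sp", groups)}


def coefficient_counts(scheme):
    """Return (uncontracted, contracted) numbers of MO coefficients per atom."""
    contractions = parse_scheme(scheme)
    uncontracted = sum(COMPONENTS[l] * sum(sizes) for l, sizes in contractions.items())
    contracted = sum(COMPONENTS[l] * len(sizes) for l, sizes in contractions.items())
    return uncontracted, contracted


d95_carbon = parse_scheme("(6111,41)")
uncontracted, contracted = coefficient_counts("(6111,41)")

print("s contractions:", d95_carbon["s"], " p contractions:", d95_carbon["p"])
print("MO coefficients per carbon atom:", uncontracted, "->", contracted)
```

Summing the digits of each group recovers the (9s5p) primitive set, while counting the digits gives the [4s,2p] contracted set.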

For methanol, the D95 basis set uses 10 basis functions for each carbon and oxygen atom, and two functions for hydrogen, yielding a total of 28 basis functions. These basis functions are constructed from a total of 64 Gaussian primitives.