ChargeCalculator:FAQ

What is the physical meaning of atomic charges?

Atomic charges, or atomic partial charges, are non-integer numbers quantifying the balance of positive (nuclear) charge and negative (electronic) charge associated with each atom. In 3D space, atomic charges represent points placed at the positions of the atomic nuclei, and may be termed atomic point charges. The molecular representation based on atomic point charges is thus a very basic abstraction of the molecular electron density.

Atomic charges are conceived to reflect the uneven distribution of electron density in the molecule. While atomic charges are merely concepts and not physical observables, they have been used heavily in theoretical and applied chemistry due to their highly intuitive character and correlation with measurable quantities such as the electrostatic potential, polarity, reactivity, etc. Atomic charges remain integral parts of many modeling applications, and are still used in reasoning about basic chemical processes.

When employing atomic charges, you must be aware of the limitations inherent to the atomic point charge model. A single number can give an idea about whether there is more electron density around some atoms compared to others, but it cannot characterize the actual distribution of electron density in the space between the atomic nuclei. Thus, all properties which flow from this distribution (such as multipole moments) are generally not well described using atomic charges.

My file contains multiple NMR states (models). Can I run ACC on all of them, or on just a few selected states?

By default, only the first NMR model is loaded from files with multiple models annotated as such. The same holds true for multiple molecules in sdf format - only the first molecule will be loaded. However, ACC can run on any number of molecules at a time, as long as each molecule is uploaded in a separate file.

Therefore, if your input file contains multiple NMR states, you must first split it into multiple files, each containing a single NMR state of interest to you. Compress these files into a .zip archive. Upload the .zip archive with all models into ACC, and you can compute atomic charges for all models in a single ACC run.

For example, say you have a .pdb file containing 15 NMR models, and you wish to run ACC for models 1-5. Copy the records belonging to model 1 into a file called model1.pdb. Then the records belonging to model 2 into a file called model2.pdb, and so on. Put these 5 files with unique names into a .zip archive, which you can then upload into ACC. Once you upload, you will see that ACC has detected each of the models separately.
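The manual splitting described above can also be scripted. The following is a minimal sketch in Python, assuming the models are delimited by standard MODEL and ENDMDL records; the function names split_models and write_zip are hypothetical helpers, not part of ACC. Records outside the MODEL blocks (headers, remarks) are dropped in this sketch.

```python
import zipfile


def split_models(pdb_text, wanted):
    """Return {model_number: pdb_text} for the requested MODEL blocks."""
    collected = {}
    current = None  # number of the MODEL block being read, if any
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            current = int(line[6:].split()[0])
            if current in wanted:
                collected[current] = []
        elif record == "ENDMDL":
            current = None
        elif current in collected:
            collected[current].append(line)
    # Close each single-model file with an END record.
    return {n: "\n".join(lines + ["END"]) + "\n" for n, lines in collected.items()}


def write_zip(models, target):
    """Store each model as modelN.pdb in a .zip archive (path or file object)."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for number in sorted(models):
            archive.writestr(f"model{number}.pdb", models[number])
```

To reproduce the example with models 1-5, read the .pdb file into a string, call split_models(text, {1, 2, 3, 4, 5}), and pass the result to write_zip together with the name of the archive to create.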

After uploading my molecule, I got a warning about "missing hydrogen atoms". How will it affect my calculation?

EEM, the empirical approach used by ACC to calculate atomic charges, produces atomic charges which respond to changes in conformation and chemical environment. In order to produce chemically relevant atomic charges using EEM, it is necessary that the structure of the molecule be complete. All protons should be present according to the relevant protonation state. Since ACC does not currently include functionality for editing the molecular structure, you must address these issues prior to uploading the molecule into ACC. For example, you may use a server like pdb2pqr to assign protonation states, add protons and subsequently estimate the total molecular charge.

ACC produces a missing H warning if no H are found in the input file (note that the warning does not appear if at least one H is present in the input file). Despite the missing H warning, ACC allows you to proceed with the charge calculation step, as it might not always be possible to obtain a perfect structure (e.g., when working with low resolution structures of extremely large complexes). The results from such calculations may not have chemical meaning in their absolute values, but they can be very useful when comparing sets of charges (open vs closed conformation, free vs bound state, etc.).

After uploading my molecule, I got a warning about "unknown chemical element names". How will it affect my calculation?

EEM, the empirical approach used by ACC to calculate atomic charges, operates with atom types based on chemical elements. When uploading the input file, ACC needs to establish the chemical element of each atom, so that it can assign a suitable atom type, and subsequently EEM parameters. ACC expects to find the chemical element at a pre-defined position in the input file, which depends on the formal guidelines established for each file format. For example, in .pdb files, ACC looks for the chemical element in the column after occupancy and temperature factor (positions 77-78).

ACC holds a predefined list of chemical elements from the periodic table of chemical elements. If ACC does not recognize the chemical element for an atom at the expected position in the input file, it will not include this atom in the atomic charge calculation because it cannot assign an atom type, and therefore EEM parameters. This means that the calculation will run only on the remaining atoms, and when you view the results the atomic charge value for the atoms with unknown chemical elements will be "NaN". If none of the chemical elements are recognized (e.g., the file format guidelines are not respected, or the input file comes from a modeling program which uses the element column to store its own atom types), you will get an error for the entire calculation. Finally, if the input file comes from a modeling program which uses the element column to store its own atom types, and these atom types overlap with known chemical elements, it is possible that ACC includes the atoms in the calculation, but it assigns wrong atom types. For example, if "Ca" appears in the element column, ACC will interpret it as calcium even if it was originally meant as C-alpha.
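Before uploading, you can check the element column yourself. The sketch below is written in Python and reads positions 77-78 of each ATOM/HETATM record in a .pdb file, as described above. The KNOWN_ELEMENTS set is a small illustrative subset chosen for this example, not the list ACC actually uses, and the function names are hypothetical.

```python
# Illustrative subset only; ACC holds its own, complete list of elements.
KNOWN_ELEMENTS = {"H", "C", "N", "O", "S", "P", "F", "CL", "BR", "I",
                  "CA", "FE", "ZN", "MG", "NA", "K"}


def element_of(line):
    """Return the element symbol stored in columns 77-78 of a PDB atom record."""
    return line[76:78].strip().upper()


def unknown_elements(pdb_text, known=KNOWN_ELEMENTS):
    """Map atom serial number -> element string that is not in `known`."""
    problems = {}
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            symbol = element_of(line)
            if symbol not in known:
                problems[int(line[6:11])] = symbol
    return problems
```

Note that a check of this kind cannot catch the "Ca" case mentioned above: a C-alpha label in the element column is indistinguishable from calcium, so such files still need to be inspected by hand.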

You can circumvent these issues and make sure these atoms are included in the calculation correctly either by fixing the input file, or using a custom EEM parameter set. The first solution is to adjust the element column in the input file so that it really displays chemical elements. Additionally, make sure the input file format follows the formal guidelines. The second solution is to use an EEM parameter set in which you include EEM parameters for the atoms with unknown chemical elements. By doing so you actually define new atom types and corresponding EEM parameters. For example, say your input file contains hydrogen atoms identified in the element column as "H1" and "Ho", depending on their binding partner. ACC will report them as unknown chemical element names warnings. You may create a new EEM parameter set based on one of the built-in sets already available in ACC (see below how to do that). Copy/paste the parameter information for hydrogen (all text enclosed in the Element tags) twice more into this new parameter set. Then change the Element name tag in one case to "H1", and in the other to "Ho". Save the new set with a unique name and select it from the list. When you start your computation, the atoms with the previously unknown element names "H1" and "Ho" will now be included in the calculation, and treated by EEM parameters suitable for hydrogen.
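The copy/paste steps for "H1" and "Ho" can be sketched in Python with the standard xml.etree.ElementTree module. The parameter set below is a made-up stand-in: its name and the A and B values are placeholders, not published EEM parameters, and clone_element is a hypothetical helper.

```python
import copy
import xml.etree.ElementTree as ET

# Placeholder parameter set; the A and B values are NOT real EEM parameters.
SAMPLE = """<ParameterSet Name="MyCustomSet">
  <Parameters Target="Atoms" Priority="0" Kappa="1.0">
    <Element Name="H">
      <Bond Type="1" A="1.0" B="2.0" />
    </Element>
  </Parameters>
</ParameterSet>"""


def clone_element(root, source, new_name):
    """Append a copy of <Element Name=source> under <Parameters> with a new Name."""
    parameters = root.find("Parameters")
    for element in parameters.findall("Element"):
        if element.get("Name") == source:
            duplicate = copy.deepcopy(element)
            duplicate.set("Name", new_name)
            parameters.append(duplicate)
            return duplicate
    raise KeyError(f"no Element named {source!r} in the parameter set")


root = ET.fromstring(SAMPLE)
for alias in ("H1", "Ho"):
    clone_element(root, "H", alias)
custom_xml = ET.tostring(root, encoding="unicode")
```

The resulting custom_xml string can then be pasted into the window for adding a new parameter set, after giving the set a unique name.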

After uploading my molecule, I got the warning "Atoms in the residue contain multiple names". How will it affect my calculation?

This generally occurs if the chain ID is not explicitly included in the input file, but the molecule contains multiple chains with overlapping residue serial numbers. Since no chain IDs are available, ACC assumes that everything belongs to one chain. When it reads atoms with a residue serial number that has already been loaded, it basically overwrites the composition of the residue with that serial number. Consequently, the computation will run, but the results will not be meaningful for the affected residues, and possibly even for neighboring residues.
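A rough pre-upload check for this situation is sketched below in Python. It is not ACC's own test; it only assumes a fixed-column .pdb file and flags every (chain ID, residue number) pair in which an atom name repeats or more than one residue name appears. Files with alternate locations will also be flagged, so treat the output as a hint rather than a verdict.

```python
from collections import defaultdict


def suspicious_residues(pdb_text):
    """Return sorted (chain_id, residue_number) pairs that look overwritten."""
    atom_names = defaultdict(set)
    residue_names = defaultdict(set)
    flagged = set()
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        key = (line[21:22].strip(), int(line[22:26]))  # chain ID, residue number
        atom = line[12:16].strip()
        residue_names[key].add(line[17:20].strip())
        # A repeated atom name or a second residue name suggests merged chains.
        if atom in atom_names[key] or len(residue_names[key]) > 1:
            flagged.add(key)
        atom_names[key].add(atom)
    return sorted(flagged)
```

An empty chain ID in the returned pairs is a strong sign that the chain column was lost when the file was generated.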

ACC provides check chain ID warnings both before and after the computation if this problem is detected, so that the input file can be corrected. Depending on how you generated the input file, this problem may or may not occur. For example, if your structure has multiple chains, and you plan to use the pdb2pqr server to add H to your structure and save it in .pqr format, you must remember to tick the option Add/Keep chain IDs in the pqr file in order to produce correct output.

All of the built-in EEM parameter sets report warnings about Missing Atoms. How does it affect my calculation?

EEM, the empirical approach used by ACC to calculate atomic charges, operates with atom types based on chemical elements. When uploading the input file, ACC needs to establish the chemical element of each atom, so that it can assign a suitable atom type, and subsequently EEM parameters. ACC expects to find the chemical element at a pre-defined position in the input file, which depends on the formal guidelines established for each file format.

It may happen that ACC does not recognize the chemical element (see above why), or there are no EEM parameters associated with that chemical element. In this case, the atom in question will not be included in the atomic charge calculation. This means that the calculation will run only on the remaining atoms, and when you view the results the atomic charge value for the atoms for which there were no EEM parameters will be "NaN". If none of the atoms can be assigned suitable EEM parameters, you will get an error for the entire calculation.

If none of the built-in EEM parameter sets contains parameters for all the atoms in your input molecule, you can circumvent this issue and make sure all atoms are included in the calculation by using a custom EEM parameter set in which you manually include EEM parameters for atoms listed as Missing Atoms. By doing so you actually define new atom types and corresponding EEM parameters. For example, say your input file contains phosphorus. ACC will report Missing Atoms. You may create a new EEM parameter set based on one of the built-in sets already available in ACC (see below how to do that). Copy/paste the parameter information for one of the atoms already present (all text enclosed in the Element tags) once more into this new parameter set. It is good to choose an atom that has similar chemical properties to phosphorus, especially electronegativity and hardness (S would probably be the best choice if available). Change the Name attribute of the Element tag to "P". Inside each Bond tag, you will find the values for parameters A and B. You may keep these values, or modify them (see below how to do that). Save the new EEM parameter set with a unique name and select it from the list. When you start your computation, the phosphorus atoms will now be included in the calculation, although the EEM parameters might not be optimal.

How do I choose a suitable EEM parameter set?

EEM, the empirical method used by ACC to calculate atomic charges, relies on empirical parameters. Many EEM parameter sets have been published in literature. They are available in ACC as built-in sets, each having a unique identifier. By default, ACC tries to suggest some suitable set of EEM parameters based on the type and atomic composition of the molecule you loaded. If ACC is unable to perform this default selection, or you feel the default choice is not optimal, you will need to make sense of the various sets available.

You will notice that the table with parameter sets is organized first according to the target. This is because the applicability domain of a given EEM parameter set is generally limited to the target molecules. Therefore, in general, one should prefer an EEM parameter set which is meant for the type of molecules of interest. This is not an absolute rule. In fact, we have observed that some EEM parameter sets developed for biomolecules perform very well for organic molecules as well. However, the opposite is not true. Therefore, it's better to choose an EEM parameter set according to the type of input molecule.

Target
Description: type of molecules that are likely to be well described using a specific set of EEM parameters
Possible values: organic molecules, drug-like molecules, proteins, etc.

Another requirement is that the set cover all atom types in the input file. This means that all chemical elements present in the input molecule should be on the list of Atoms, and nothing should be listed at Missing atoms. If all built-in sets report Missing atoms, you will probably have to Add a new EEM parameter set where all necessary EEM parameters are provided (see below how).

Atoms
Description: List of atom types covered by the EEM parameter set. Depends on the type of molecules used to produce reference data during the development of the EEM parameters.
Possible values: H, C, N, O, Cl, etc.
Missing Atoms
Description: List of atom types present in the input file but not covered by the EEM parameter set. These atoms will not be included in the atomic charge calculation using this EEM parameter set.
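
The relation between Atoms and Missing Atoms is a plain set difference. The Python sketch below illustrates it on a trimmed parameter set containing only C and F; the function names are hypothetical, and the way ACC computes these lists internally is not documented here.

```python
import xml.etree.ElementTree as ET

# Trimmed parameter set that covers only carbon and fluorine.
SAMPLE = """<ParameterSet Name="Bult2002_npa">
  <Parameters Target="Atoms" Priority="0" Kappa="1.000000000000">
    <Element Name="C"><Bond Type="1" A="8.49" B="18.3" /></Element>
    <Element Name="F"><Bond Type="1" A="39.18" B="88.2" /></Element>
  </Parameters>
</ParameterSet>"""


def covered_atoms(parameter_set_xml):
    """Return the set of Element names defined in a parameter set."""
    root = ET.fromstring(parameter_set_xml)
    return {element.get("Name") for element in root.iter("Element")}


def missing_atoms(molecule_elements, parameter_set_xml):
    """Elements present in the molecule but absent from the parameter set."""
    return sorted(set(molecule_elements) - covered_atoms(parameter_set_xml))


missing = missing_atoms(["C", "H", "F", "P", "C"], SAMPLE)
```

Here missing lists the elements that would be reported under Missing Atoms for this trimmed set.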

Further, one should consider the approach used during the development of the parameters. EEM parameters are generally developed based on reference quantum mechanical (QM) calculations. A QM calculation is characterized by the setup of the wave function calculation (theory level, basis set, environment), and the type of observables that will be calculated and interpreted. Most commonly, QM reference data used for the development of EEM parameters consists of atomic charges, which are derived from the observable electron density according to a specific charge definition, meaning a procedure used to partition the molecular electron density, or to deduce the electrostatic contribution of each atom. Because atomic charges are not physical observables and have only a conceptual character, there is no unique charge definition that is universally accepted. Rather, a score of charge definitions have been published and are in use, each with their own strengths and weaknesses.

We denote as approach any association of a QM calculation setup and charge definition.

Approach: QM Method, Basis Set, Population Analysis
Description: association of a QM calculation setup and charge definition. Gives the nature of the reference QM data used during the development of the EEM parameters. The applicability domain of an EEM parameter set is closely related to the applicability domain of the reference QM data.
Possible values:
  • QM Method - level of theory used to solve Schrödinger's equation - HF, B3LYP, etc.
  • Basis Set - set of basis functions used to solve Schrödinger's equation - 6-31G*, STO-3G, etc.
  • Population Analysis - charge definition used after solving Schrödinger's equation to partition the molecular electron density, or to deduce the electrostatic contribution of each atom - MPA (Mulliken population analysis), NPA (Natural population analysis), MK (Merz-Kollman scheme for fitting to electrostatic potentials), etc.

The applicability domain and maximum expected accuracy of an EEM parameter set is closely related to the corresponding QM charges obtained by that particular approach. The maximum accuracy for a particular application of any set of EEM parameters is given by the charge definition used during its development. Therefore, if available, pick an EEM parameter set with a higher level of theory, and most importantly a charge definition suitable for the subsequent application of the atomic charges (what you have in mind to do with the charges). For example, pick MK charges if you plan to run simulations, NPA charges if you plan to interpret reactivity, MPA charges if you plan to do QSPR, etc.

The performance of a given EEM parameter set is further influenced by the procedure used when fitting the EEM parameters to the reference data (size and nature of the QM reference dataset, fitting and optimization algorithms, etc.). Sets with higher training set size should theoretically be more robust. The data source should refer to molecules of the same type as your molecule of interest.

Training Set Size, Data Source
Description: Number and type of molecules used to produce reference data during the development of the EEM parameters.

Finally, to help you make decisions faster, we have included a very basic grading system in the form of the priority descriptor given for each EEM parameter set. When in doubt, pick a parameter set with a low value of the priority descriptor (1, 2, ...).

Priority
Description: Very basic grading system. Serves mainly to identify a suitable default setup. Currently curated manually.
Possible values: For EEM parameter sets focused on biomolecules, priorities are assigned based on their performance in the external validation stage of their development. For sets focused on organic molecules, priorities are assigned based on year of publication, level of theory of the QM reference data, and the results of a small in-house QM benchmark on paracetamol. Lower values are preferred.
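
The selection rules in this answer (matching target, full atom coverage, low priority value) can be combined into a simple filter. The Python sketch below is only an illustration of that reasoning: the dictionary layout and the set names SetA, SetB and SetC are invented for the example, and ACC's own default selection may work differently.

```python
def rank_parameter_sets(sets, molecule_elements, target=None):
    """Keep sets that cover every element (and match the target), best priority first."""
    needed = set(molecule_elements)
    usable = [
        s for s in sets
        if needed <= set(s["atoms"]) and (target is None or s["target"] == target)
    ]
    # Lower priority values are preferred.
    return sorted(usable, key=lambda s: s["priority"])


# Hypothetical catalogue of parameter sets, for illustration only.
CATALOGUE = [
    {"name": "SetA", "target": "organic molecules", "atoms": ["H", "C", "N", "O"], "priority": 3},
    {"name": "SetB", "target": "organic molecules", "atoms": ["H", "C", "N", "O", "S"], "priority": 1},
    {"name": "SetC", "target": "proteins", "atoms": ["H", "C", "N", "O", "S"], "priority": 2},
]

ranked = rank_parameter_sets(CATALOGUE, ["C", "H", "O", "S"])
```

Passing target="proteins" would restrict the same ranking to sets developed for that type of molecule.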

All in all, it is always good to try several parameter sets, and draw conclusions based on the trends observed in most sets of results. The EEM implementations in ACC are all computationally efficient, so running calculations with multiple parameter sets is not a problem.

How do I read the XML file with EEM parameters?

EEM, the empirical method used by ACC to calculate atomic charges, relies on empirical parameters. Many EEM parameter sets have been published in literature, and are available in ACC as built-in sets. EEM parameter sets are stored in XML format, where the information is organized using tags. Each type of useful information is marked by a start and an end tag, may have attributes and may contain sub-elements or text.

Each EEM parameter set is marked by the ParameterSet tag and the attribute Name, which encodes a unique identifier. Further, each EEM parameter set is described by properties marked by the Properties tag, which provide the literature reference and basic information about the development of the EEM parameters. Please read the section on how to choose a suitable EEM parameter set in order to better understand the importance of the information given by Properties.

Next, the EEM parameters are given under the tag Parameters, which has three attributes. Target and Priority are important for choosing a suitable EEM parameter set. The attribute Kappa is actually a special EEM parameter which, conceptually, modulates the electrostatic interaction of each atom with the surrounding charges.

<ParameterSet Name="Bult2002_npa">

 <Properties>
   <Property Name="Author">Bultinck, P., Langenaeker, W., Lahorte, P., De Proft, F., Geerlings, P., Waroquier, M., Tollenaere, J. P.</Property>
   <Property Name="Publication">The Electronegativity Equalization Method II: Applicability of Different Atomic Charge schemes</Property>
   <Property Name="Journal">J. Phys. Chem. A, 106, 7895-7901</Property>
   <Property Name="Year">2002</Property>
   <Property Name="Target">Organic molecules</Property>
   <Property Name="QM Method">B3LYP</Property>
   <Property Name="Basis Set">6-31G*</Property>
   <Property Name="Population Analysis">NPA</Property>
   <Property Name="Training Set Size">138</Property>
   <Property Name="Data Source">Not Specified</Property>
   <Property Name="Priority">4</Property>
 </Properties>
 <UnitConversion KappaFactor="0.529177249000" ABFactor="0.036749309000" />
 <Parameters Target="Atoms" Priority="0" Kappa="1.000000000000">
   <Element Name="C">
     <Bond Type="1" A="8.490000000000" B="18.300000000000" />
     <Bond Type="2" A="8.490000000000" B="18.300000000000" />
     <Bond Type="3" A="8.490000000000" B="18.300000000000" />
   </Element>
   <Element Name="F">
     <Bond Type="1" A="39.180000000000" B="88.200000000000" />
   </Element>
   ...
 </Parameters>
</ParameterSet>

The rest of the EEM parameters operate with atom types based on chemical elements, and are marked by the Element tags. However, some EEM parameter sets available in literature employ atom types which depend not only on chemical element, but also on the maximum bond multiplicity. In such EEM parameter sets there are, for example, different EEM parameters for carbon atoms with sp³ hybridization, than for carbon atoms with sp² hybridization. In order to keep a consistent scheme of storing and assigning parameters, ACC implements by default an EEM parameter scheme which supports bond information via the Bond tag. Thus, for each chemical element there will be one Element tag, and at least one Bond tag. EEM parameter sets which are based solely on chemical elements and no bond information contain multiple Bond tags as well, but the parameters associated with different bond multiplicities (the attribute Type) are actually merely copies, as seen in the example above.

The attribute Type encodes the maximum bond multiplicity. In general, sp³ hybridization is encoded as Type=1, sp² hybridization as Type=2, and sp hybridization as Type=3. These values might seem unintuitive, but they are based on connectivity information from the input file, or computed based on interatomic distances. Type=0 encodes a coordinated metal ion.

The actual EEM parameters are encoded in the attributes A and B, conceptually related to electronegativity and hardness, respectively.

Note also the UnitConversion tag, used to unify the values of the parameters from different sources in literature. The attribute KappaFactor is the correction applied to the Kappa parameter, while the attribute ABFactor is the correction applied to the values of parameters A and B. In the example above, the electronegativity and hardness parameters (A and B, respectively) were originally given in eV, while the atomic distances were given in atomic units. ACC uses parameters A and B in atomic units (Hartrees), and atomic distances in Angstroms. The conversion factors are:

1 Hartree = 27.2114 eV
1 atomic unit of length (bohr) = 0.529177249 Angstroms

Therefore, the following corrections will be applied to the EEM parameters. The Kappa parameter will be corrected by a KappaFactor of 0.529177249. The A and B parameters will be corrected by an ABFactor of 0.036749309 (1/27.2114).
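
As a worked example, the Python sketch below reads a trimmed copy of the parameter set shown above and applies the two factors. It assumes the corrections are plain multiplications (parameter value times factor), which is consistent with the eV to Hartree conversion described here but is not confirmed for ACC's internals; converted_parameters is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example parameter set (only the parts used here).
SAMPLE = """<ParameterSet Name="Bult2002_npa">
  <UnitConversion KappaFactor="0.529177249000" ABFactor="0.036749309000" />
  <Parameters Target="Atoms" Priority="0" Kappa="1.000000000000">
    <Element Name="C">
      <Bond Type="1" A="8.490000000000" B="18.300000000000" />
    </Element>
    <Element Name="F">
      <Bond Type="1" A="39.180000000000" B="88.200000000000" />
    </Element>
  </Parameters>
</ParameterSet>"""


def converted_parameters(xml_text):
    """Return (kappa, {(element, bond_type): (A, B)}) after unit conversion."""
    root = ET.fromstring(xml_text)
    units = root.find("UnitConversion")
    ab_factor = float(units.get("ABFactor"))
    kappa_factor = float(units.get("KappaFactor"))
    parameters = root.find("Parameters")
    # Assumption: each correction is a simple multiplication by its factor.
    kappa = float(parameters.get("Kappa")) * kappa_factor
    table = {}
    for element in parameters.findall("Element"):
        for bond in element.findall("Bond"):
            key = (element.get("Name"), int(bond.get("Type")))
            table[key] = (float(bond.get("A")) * ab_factor,
                          float(bond.get("B")) * ab_factor)
    return kappa, table


kappa, table = converted_parameters(SAMPLE)
a_carbon, b_carbon = table[("C", 1)]  # roughly 0.312 and 0.673 Hartree
```

Under this assumption, the carbon value A = 8.49 eV becomes about 0.312 Hartree, and the Kappa value of 1 becomes 0.529177249.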

How do I add EEM parameters if they do not exist?

EEM, the empirical approach used by ACC to calculate atomic charges, operates with atom types primarily based on chemical elements. When uploading the input file, ACC seeks to establish the chemical element of each atom, so that it can assign a suitable atom type, and subsequently EEM parameters. If ACC does not recognize the chemical element for an atom, it will report an unknown chemical element name warning. If ACC recognizes the chemical element but cannot find any EEM parameters associated with it, it will report Missing Atoms. Either way, ACC will not include this atom in the atomic charge calculation.

To make sure that even such problematic atoms are included in the calculation, you must use a custom EEM parameter set. If you have not already done so, please have a look at how to read the XML file with EEM parameters before you proceed.

First, in the table with EEM parameter sets, click on one of the suitable built-in parameter sets already available. Basic information about the set will be displayed in the panel to the right of the table. Then click the black View XML button at the top of this panel with information. A new tab will open in your browser where the content of the EEM parameter set will be displayed in XML format. Copy the content of this page, then return to the ACC Setup page. Click the black Add button on top of the table with parameter sets. A small window will open, where you select all the content (CTRL+A) and replace it with the content of the parameter set you previously viewed and copied (CTRL+V).

You now have a copy of the built-in EEM parameter set. The idea is that you will need to actually define a new atom type and corresponding EEM parameters for the chemical element that is unknown or for which no EEM parameters are available. Pick a chemical element that has similar chemical properties (especially electronegativity and hardness) to the new chemical element you wish to introduce. Copy the whole content between the start and closing Element tags, and then paste this content inside the Parameters tag, but outside other Element tags. Now modify the attribute Name to the chemical element you wish to include. Then, inside each Bond tag, you will find the values for parameters A and B. You may keep these values, or modify them. You may repeat this procedure for as many new chemical elements as you need.

Save the new EEM parameter set with a unique name. ACC will validate the syntax and let you know if any further modifications are necessary. When you return to the ACC Setup page, select your new parameter set from the table. When you start your computation, the atoms with previously unknown chemical elements or missing EEM parameters will now be included in the calculation, although the EEM parameters might not be optimal.

Note that you need not copy/paste content and make modifications as described here. You can directly use the XML form provided in the window which opens when you click the Add button, as long as you respect the XML syntax.

Can I combine EEM parameters originating from different sets?

Suppose you have a biomacromolecule in complex with a drug inhibitor. You may be wondering if it is necessary, or at least possible, to use different EEM parameter sets for the two molecules. From our experience, it is sufficient to employ a parameter set with the attribute target=biomolecules for the entire complex. Alternatively, you may want to use a certain set suitable for your type of molecule, but which misses EEM parameters for one atom type, and such parameters are available in a different set.

If you really want to combine EEM parameters which originate from different sets, there is a way. In the input file, identify the chemical element column. For all atoms that you would like to treat differently, make some modification to the chemical element. This way, for example, the chemical elements of all ligand atoms can be changed from H, C, N, O, Cl into Hx, Cx, Nx, Ox and Clx. Then upload the molecule. ACC will report an unknown chemical element names warning for the atoms whose chemical elements you have modified. All you need to do now is create a custom parameter set as a copy of one parameter set, with additional Element tags from a second parameter set.

It is very important to take note of the kappa parameter. It should be comparable in the two sets you want to merge. If this condition is not satisfied, you might have to alter the values of the new parameters before they can perform effectively with a different set. In general, when trying out such merges it is good to run a few parallel calculations in which you progressively change the value of the parameters in question, and inspect the resulting atomic charges. Try to look for intervals of the parameter space for which the trends in charges remain the same.
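
The merge described above can be sketched in Python as follows. This is an illustration under several assumptions: the suffix "x" matches the renaming example, the tolerance used to decide whether two kappa values are "comparable" is an arbitrary placeholder, a missing UnitConversion tag is treated as factors of 1, and the sketch refuses to merge sets whose UnitConversion factors differ. The function merge_sets and the sample sets are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# Placeholder sets; names and A/B values are invented for illustration.
BASE = """<ParameterSet Name="BaseSet">
  <Parameters Target="Atoms" Priority="0" Kappa="1.0">
    <Element Name="C"><Bond Type="1" A="1.0" B="2.0" /></Element>
  </Parameters>
</ParameterSet>"""
DONOR = """<ParameterSet Name="DonorSet">
  <Parameters Target="Atoms" Priority="0" Kappa="1.0">
    <Element Name="C"><Bond Type="1" A="3.0" B="4.0" /></Element>
  </Parameters>
</ParameterSet>"""


def _factors(root):
    """Return (KappaFactor, ABFactor); assumed to be 1.0 when the tag is absent."""
    tag = root.find("UnitConversion")
    if tag is None:
        return (1.0, 1.0)
    return (float(tag.get("KappaFactor", 1)), float(tag.get("ABFactor", 1)))


def merge_sets(base_xml, donor_xml, suffix="x", kappa_tolerance=0.1):
    """Copy every donor Element into the base set under a suffixed name."""
    base, donor = ET.fromstring(base_xml), ET.fromstring(donor_xml)
    if _factors(base) != _factors(donor):
        raise ValueError("sets use different unit conversion factors")
    base_params, donor_params = base.find("Parameters"), donor.find("Parameters")
    gap = abs(float(base_params.get("Kappa")) - float(donor_params.get("Kappa")))
    if gap > kappa_tolerance:  # placeholder threshold, not an ACC rule
        raise ValueError(f"kappa values differ by {gap}")
    for element in donor_params.findall("Element"):
        renamed = copy.deepcopy(element)
        renamed.set("Name", element.get("Name") + suffix)
        base_params.append(renamed)
    return base


merged = merge_sets(BASE, DONOR)
```

When the kappa check fails, the sketch stops rather than merging, which mirrors the advice above to adjust and test the new parameters before relying on them.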

The calculation ran, but I got a warning that some atoms were skipped.

EEM, the empirical approach used by ACC to calculate atomic charges, operates with atom types based on chemical elements. When uploading the input file, ACC needs to establish the chemical element of each atom, so that it can assign a suitable atom type, and subsequently EEM parameters. If ACC does not recognize the chemical element (see above why), or no EEM parameters are available for that chemical element, ACC cannot assign EEM parameters to that atom and it will thus be unable to include it in the EEM calculation.

In the final results, the charge for such atoms that are skipped will appear with the value "n/a" in the web interface. In the downloadable results, the charge for these atoms will appear as 0.000 in the .mol2 and .pqr files, and as "-" in the .csv files. The .wprop files will not list these atoms at all. These values will not contribute to any of the statistics computed by ACC.

The calculation ran, but I did not obtain any charges.

EEM, the empirical approach used by ACC to calculate atomic charges, operates with atom types based on chemical elements. When uploading the input file, ACC needs to establish the chemical element of each atom, so that it can assign a suitable atom type, and subsequently EEM parameters. If ACC does not recognize the chemical element (see above why), or no EEM parameters are available for that chemical element, ACC cannot assign EEM parameters to that atom and it will thus be unable to include it in the EEM calculation. If this happens with all the atoms in the input file, then no values of atomic charges will be available in the results. In such situations, ACC will report an error.

The calculation ran, but I got the warning "Missing parameters for symbol ... and multiplicity .... Using value for multiplicity ... instead."

EEM, the empirical approach used by ACC to calculate atomic charges, operates with atom types based on chemical elements. Furthermore, some EEM parameter sets available in literature employ atom types which depend not only on chemical element (or symbol), but also on the maximum bond multiplicity. This means that in such EEM parameter sets there are, for example, different EEM parameters for carbon atoms with sp³ hybridization, than for carbon atoms with sp² hybridization. In order to keep a consistent scheme of storing and assigning parameters, ACC implements by default an EEM parameter scheme which supports bond multiplicity information via the tag Bond and its attribute Type. EEM parameter sets which are based solely on chemical elements contain multiple Bond tags as well, but the parameters associated with different bond multiplicities (Type attribute) are actually merely copies.

Because the unified parameter scheme employs bond multiplicity information, ACC needs to establish the chemical element of each atom, as well as its maximum bond multiplicity, so that it can assign a suitable atom type, and subsequently EEM parameters. When uploading the input file, ACC expects to find the chemical element at a pre-defined position in the input file, which depends on the formal guidelines established for each file format. For example, in .pdb files, ACC looks for the chemical element in the column after occupancy and temperature factor (positions 77-78).

As soon as the calculation starts, ACC attempts to obtain bond information. First, it searches the input file for connectivity information. If this is not present, ACC attempts to establish the maximum bond multiplicity based on interatomic distances. This algorithm generally has trouble when interatomic distances vary significantly from the expected norms, or when handling coordinated atoms, and it may generate unexpected bond multiplicities for which no EEM parameters are available. To overcome such situations, ACC falls back to the EEM parameters for the nearest bond multiplicity available.

This fallback also happens when the input molecule contains an atom type with a maximum bond multiplicity which was indeed not covered during the development of the EEM parameters used in a given calculation. For instance, if the reference data used for the development of the particular EEM parameter set you chose did not contain any sp² nitrogen (Bond Type="2"), ACC will fall back to the EEM parameters for sp³ nitrogen (Bond Type="1"). ACC will produce a warning to inform you of this fact, and in the final results the atom type will still be N:2, as originally detected in the input file. It is up to you to decide if the values of atomic charges for these problematic atoms are acceptable.
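
The fallback can be pictured with the short Python sketch below. It is not ACC's implementation: the lookup table layout is invented for the example, the A and B values are placeholders, and the tie-breaking rule (prefer the lower multiplicity when two are equally near) is an assumption.

```python
def parameters_for(table, symbol, multiplicity):
    """Return ((A, B), used_multiplicity), falling back to the nearest multiplicity."""
    available = sorted(m for (s, m) in table if s == symbol)
    if not available:
        raise KeyError(f"no EEM parameters for element {symbol!r}")
    if multiplicity in available:
        used = multiplicity
    else:
        # Nearest available multiplicity; ties go to the lower value (assumption).
        used = min(available, key=lambda m: abs(m - multiplicity))
    return table[(symbol, used)], used


# Placeholder table: nitrogen has parameters for Type=1 only.
TABLE = {("N", 1): (1.0, 2.0), ("C", 1): (3.0, 4.0), ("C", 2): (5.0, 6.0)}

values, used = parameters_for(TABLE, "N", 2)
```

In this example the requested atom type is N:2, but the parameters returned are those stored for N:1, which is the situation the warning reports.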

Why did different EEM parameter sets produce different charges?

EEM, the empirical method used by ACC to calculate atomic charges, relies on empirical parameters. Many EEM parameter sets have been developed using different reference data and fitting procedures.

EEM parameters are generally developed based on reference quantum mechanical (QM) calculations. A QM calculation is characterized by the setup of the wave function calculation (theory level, basis set, environment), and the type of observables that will be calculated and interpreted. Most commonly, QM reference data used for the development of EEM parameters consists of atomic charges, which are derived from the observable electron density according to a specific charge definition, meaning a procedure used to partition the molecular electron density, or to deduce the electrostatic contribution of each atom. Because atomic charges are not physical observables and have only a conceptual character, there is no unique charge definition that is universally accepted. Rather, a score of charge definitions have been published and are in use, each with their own strengths and weaknesses.

We use the term approach for any combination of a QM calculation setup and a charge definition.

Approach: QM Method, Basis Set, Population Analysis
Description: association of a QM calculation setup and charge definition. Gives the nature of the reference QM data used during the development of the EEM parameters. The applicability domain of an EEM parameter set is closely related to the applicability domain of the reference QM data.
Possible values:
  • QM Method - level of theory used to solve Schrödinger's equation - HF, B3LYP, etc.
  • Basis Set - set of basis functions used to solve Schrödinger's equation - 6-31G*, STO-3G, etc.
  • Population Analysis - charge definition used after solving Schrödinger's equation to partition the molecular electron density, or to deduce the electrostatic contribution of each atom - MPA (Mulliken population analysis), NPA (Natural population analysis), MK (Merz-Kollman scheme for fitting to electrostatic potentials), etc.

Therefore, the first major factor which causes differences in charges is the charge definition, or rather the inherently different principles used to define the amount of electron density to be assigned to each atom. The setup of the reference QM calculations may also cause differences, but much smaller than those caused by the charge definitions. Note that such differences are to be expected, and can be quantified and analyzed in the Compare tab of the ACC Specifics page.
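If you export two sets of charges for the same molecule and want to quantify their differences yourself, two common summary statistics are the root mean square deviation (RMSD) and the Pearson correlation coefficient. The sketch below assumes two equally long lists of charges in the same atom order; it is not a description of what the Compare tab computes, and the function name and sample values are hypothetical.

```python
import math


def compare_charge_sets(charges_a, charges_b):
    """Return (rmsd, pearson_r) for two charge lists in the same atom order."""
    if len(charges_a) != len(charges_b) or not charges_a:
        raise ValueError("charge lists must be non-empty and of equal length")
    n = len(charges_a)
    rmsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(charges_a, charges_b)) / n)
    mean_a = sum(charges_a) / n
    mean_b = sum(charges_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(charges_a, charges_b))
    var_a = sum((a - mean_a) ** 2 for a in charges_a)
    var_b = sum((b - mean_b) ** 2 for b in charges_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when all charges are equal")
    return rmsd, cov / math.sqrt(var_a * var_b)


# Made-up charges for a three-atom molecule from two parameter sets.
rmsd, r = compare_charge_sets([-0.4, 0.1, 0.3], [-0.8, 0.2, 0.6])
print(f"RMSD = {rmsd:.3f} e, Pearson r = {r:.3f}")
```

A high correlation combined with a large RMSD, as in this made-up example, indicates that the two charge sets agree on the trends but differ in magnitude, which is typical when the underlying charge definitions differ.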

Further, even when two EEM parameter sets were developed using the exact same approach, they can still produce different sets of charges. The first reason is the size and nature of the training set that was used to produce the reference data. The training set, which is generally a set of molecules, actually defines the accessible parameter space. The second reason has to do with the parameter fitting procedure, which may have identified a completely different point in the parameter space even when the same training set was used.

Training Set Size, Data Source
Description: Number and type of molecules used to produce reference data during the development of the EEM parameters.

Last but not least, the differences might be caused by an error in the parameters. If you detect such an error, please let us know, so that we may correct it.

Why do residues have non-integer charge?

EEM, the empirical approach used by ACC to calculate atomic charges, works at the atomic level, and does not see the electronic structure. Nonetheless, due to the principle of electronegativity equalization, EEM allows electron density to spread across the molecule in a manner which depends on the nature of the atoms and the chemical environment created by the surrounding atoms. The degree to which this happens also depends on the charge definition and fitting algorithms used during the development of the EEM parameters (see above).

This means that atomic charges in a residue depend on the conformation of the residue, as well as the nature and conformation of nearby residues. The total charge on each residue may differ from the expected formal charge (-1, 0, +1) due to charge transfer to the surrounding residues, ligands, ions, water, etc. While this behavior is realistic, it may not be desired for some applications (e.g., some modeling programs expect integer charge on each residue).
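To see how far each residue deviates from its formal charge, you can sum the atomic charges per residue. The sketch below assumes you have already paired each atom's charge with its chain, residue number, and residue name; the tuple layout, function name, and charge values are hypothetical and do not reflect any particular ACC output format.

```python
from collections import defaultdict


def residue_charges(atoms):
    """Sum atomic charges per residue.

    atoms: iterable of (chain, residue_number, residue_name, charge) tuples.
    Returns a dict mapping (chain, residue_number, residue_name) to total charge.
    """
    totals = defaultdict(float)
    for chain, number, name, charge in atoms:
        totals[(chain, number, name)] += charge
    return dict(totals)


# Made-up charges, for illustration only.
atoms = [
    ("A", 10, "ASP", -0.55),
    ("A", 10, "ASP", -0.32),
    ("A", 11, "LYS", 0.81),
]
for residue, total in residue_charges(atoms).items():
    print(residue, round(total, 2))
```

A total such as -0.87 for a residue with formal charge -1 would be an instance of the charge transfer described above, not a sign of a failed calculation.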

Can I get good electrostatic potentials?

Whether or not atomic charges can generate accurate electrostatic potentials depends on several factors. First, certain charge definitions (see above) are based on principles which relate atomic charges to electrostatic potentials. Therefore, if you expect to compute potentials based on the atomic charges, you should probably pick an EEM parameter set developed for a suitable charge definition (MK, CHELPG...). Second, the concept of atomic point charges is inherently limited with respect to describing charge gradients, so the resulting potentials will be described better in some regions of the 3D space around the molecule than in others.

Further, one should keep in mind that EEM, the method used by ACC to calculate atomic charges, is an empirical approach. EEM parameters available in ACC were mostly fitted to reference data in the form of atomic charges from quantum mechanical (QM) calculations. EEM is an approximation meant to keep as much accuracy as possible (compared to reference data) while maximizing computational efficiency. Therefore, the maximum accuracy to be expected for EEM atomic charges cannot exceed the accuracy of the corresponding QM charges in reproducing electrostatic potentials.

Finally, note that some papers provide straightforward evaluations of the ability of their EEM parameter sets to reproduce electrostatic potentials from QM calculations. So follow the citation of the EEM parameter set you plan to use, and see if this information is available in the original paper. Note that the current implementation of ACC provides only the values of atomic charges, and you will have to compute the electrostatic potentials yourself (e.g., on the pdb2pqr server). In the future ACC might support such functionality.
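For a quick estimate, the electrostatic potential at a point can be computed directly from the atomic point charges with Coulomb's law. The sketch below assumes charges in units of e, coordinates in Å, vacuum (no dielectric screening or solvent model), and reports the potential in kcal/(mol·e). It is a simplification compared to what dedicated electrostatics software does, and the function name is hypothetical.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2), for charges in e and distances in Angstrom.
COULOMB_CONSTANT = 332.0637


def electrostatic_potential(point, atoms):
    """Potential at `point` from point charges, in kcal/(mol*e).

    point: (x, y, z) in Angstrom.
    atoms: iterable of ((x, y, z), charge) with coordinates in Angstrom, charge in e.
    """
    total = 0.0
    for position, charge in atoms:
        distance = math.dist(point, position)  # requires Python 3.8 or newer
        if distance == 0.0:
            raise ValueError("potential diverges at the position of a point charge")
        total += charge / distance
    return COULOMB_CONSTANT * total


# Made-up two-atom example: probe point 2 Angstrom beyond the positive charge.
atoms = [((0.0, 0.0, 0.0), 0.4), ((1.0, 0.0, 0.0), -0.4)]
print(electrostatic_potential((-2.0, 0.0, 0.0), atoms))
```

Evaluating this function on a grid or on a molecular surface gives the potential map; close to the nuclei the point charge approximation is least reliable, in line with the limitations discussed above.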

Can I get good dipole moments?

Atomic charges are non-integer numbers quantifying the balance of positive (nuclear) charge and negative (electronic) charge associated with each atom. In the 3D space, atomic charges represent points placed at the position of the atomic nuclei, and may be termed atomic point charges. The molecular representation based on atomic point charges is thus a very basic abstraction of the molecular electron density.

Atomic charges are conceived to reflect the uneven distribution of electron density in the molecule. When employing atomic charges, you must be aware of the limitations inherent to the atomic point charge model. A single number can give an idea about whether there is more electron density around some atoms compared to others, but it cannot characterize the actual distribution of electron density in the space between the atomic nuclei. Thus, all properties which flow from this distribution are generally not well described using atomic charges.

Dipoles and higher order multipoles are known to be poorly approximated by a point charge model. Dipole moments measure the degree of separation of positive and negative charge in the molecule (polarity), and are, by definition, very sensitive to small variations in the distribution of electron density. Even atomic charges computed at the quantum mechanical (QM) level have trouble reproducing dipole moments, though some charge definitions (see above) are more successful than others (MK charges can be satisfactory for estimating dipole moments for small molecules). It is thus clear that one cannot expect the accuracy of empirical models fitted to QM reference charges to exceed the accuracy of QM atomic charges in reproducing dipole moments.

Nonetheless, you may well obtain reasonable results for relative dipole moments in a series of derivatives of a certain chemical compound. In other words, the atomic charges obtained from ACC calculations may not provide accurate dipole moments for a single molecule, but they can still be useful for comparing the polarities of different derivatives of this molecule. Note that the current implementation of ACC provides only the values of atomic charges, and you will have to compute the dipole moments yourself. In the future ACC might support such functionality.
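In the point charge model, the dipole moment is the sum of each atomic charge multiplied by its position vector. The sketch below assumes charges in e and coordinates in Å, and converts the result to debye (1 e·Å is approximately 4.8032 D). The function name and sample values are hypothetical. Keep in mind that for a molecule with non-zero total charge the result depends on the choice of origin.

```python
import math

E_ANGSTROM_TO_DEBYE = 4.8032  # 1 e*Angstrom expressed in debye (approximate)


def dipole_moment(atoms):
    """Dipole moment from point charges.

    atoms: iterable of ((x, y, z), charge) with coordinates in Angstrom, charge in e.
    Returns ((mu_x, mu_y, mu_z), magnitude), both in debye.
    """
    mu = [0.0, 0.0, 0.0]
    for position, charge in atoms:
        for axis in range(3):
            mu[axis] += charge * position[axis]
    mu_debye = tuple(component * E_ANGSTROM_TO_DEBYE for component in mu)
    return mu_debye, math.sqrt(sum(c * c for c in mu_debye))


# Made-up neutral two-atom example with charges of +0.5 e and -0.5 e, 1 Angstrom apart.
vector, magnitude = dipole_moment([((0.0, 0.0, 0.0), 0.5), ((1.0, 0.0, 0.0), -0.5)])
print(vector, magnitude)
```

When comparing a series of derivatives, compute the dipole moments with the same EEM parameter set for every molecule, so that any systematic error affects all of them in a similar way.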
