A user-friendly Matlab program and GUI for the pseudorotation analysis of saturated five-membered ring systems based on scalar coupling constants.

BACKGROUND
The advent of combinatorial chemistry has revived interest in five-membered heterocyclic rings as scaffolds in pharmaceutical research. They are also the target of modifications in nucleic acid chemistry. Hence, the characterization of their conformational features is of considerable interest. This can be accomplished by analysis of the 3J(HH) scalar coupling constants.


RESULTS
A freely available program including an easy-to-use graphical user interface (GUI) has been developed for the calculation of five-membered ring conformations from scalar coupling constant data. A variety of operational modes and parameterizations can be selected by the user, and the coupling constants and electronegativity parameters can be defined interactively. Furthermore, the possibility of generating high-quality graphical output of the conformational space accessible to the molecule under study facilitates the interpretation of the results. These features are illustrated via the conformational analysis of two 4'-thio-2'-deoxynucleoside analogs. Results are discussed and compared with those obtained using the original PSEUROT program.


CONCLUSION
A user-friendly Matlab interface has been developed and tested. This should considerably improve the accessibility of this kind of calculation to the chemical community.


Background
Five-membered heterocyclic ring systems constitute an important part of many biologically relevant molecules. They occur in carbohydrates (furanoses), nucleosides and nucleotides, the amino acid proline and their many derivatives. In addition, they often occur as a moiety in complex natural products. Chemical modifications of nucleic acids, often driven by the needs of antisense research, target in part the five-membered cycle or its analogues in order to tailor their conformation towards the desired needs [1,2].
Typically, the chemical and conformational space is explored by introducing a diversity of substituents at varying positions around the cycle. Depending on the position and nature of these substituents, the cycle either adopts a single conformation or may be in equilibrium between two conformations. These conformations will in turn impact on the conformational space that will be covered by the substituents, making the determination of the cycle's conformation an issue of considerable interest.
Over the years, NMR has become a well-established technique for this purpose. In particular, $^3J_{HH}$ scalar coupling constants are well suited, as they are mainly determined by the torsion angle over which they are measured. In the case of ring systems, the vicinal $^3J_{HH}$ scalar coupling constants are directly correlated to their corresponding exocyclic torsion angles ($\phi_{exo}$). These are related to the corresponding endocyclic torsion angles ($\nu_{endo}$) by a simple equation (1), where A and B are constants determined by the geometry of the atoms linked to the common central bond.
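Equation (1) itself is not reproduced in this text. A minimal sketch, assuming the linear form commonly used in PSEUROT-type analyses, with $\phi_{exo}$ the exocyclic and $\nu_{endo}$ the endocyclic torsion angle (both symbols are notation assumed here):

```latex
\begin{equation}
\phi_{exo} = A \, \nu_{endo} + B \tag{1}
\end{equation}
```

Under this assumption, A acts as a scaling factor close to unity and B as a phase offset that depends on the relative orientation of the two coupled protons.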
As the set of all five endocyclic torsion angles in a five-membered ring fully determines its conformation, $^3J_{HH}$ scalar coupling constants provide a direct measure of the ring's conformation. The Haasnoot-Altona equation (3) [12] and the Diez-Donders equation (4) [13,14], both based on the well-known Karplus equation (2), describe the relation between a $^3J_{HH}$ coupling and the corresponding exocyclic torsion angle ($\phi_{exo}$) to a high level of accuracy. In both equations this is mainly achieved by including a set of four parameters $\lambda_i$ (i = 1, ..., 4) that account for the influence of electronic effects contributed by the substituents [15,16]. In some studies, the set of experimental $^3J_{HH}$ scalar coupling constants is further extended by $^3J_{HF}$ scalar coupling constants [17-19] or interproton distances obtained by nOe NMR experiments. Here, however, we assume that only $^3J_{HH}$ scalar couplings are available for conformational analysis.
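The basic Karplus relation (2) that underlies equations (3) and (4) can be sketched as follows. This is only an illustration: the function name `karplus` and the default values of P1, P2 and P3 are placeholders chosen for readability, not parameters taken from the paper or from any published parameterization, and the substituent corrections of equations (3) and (4) are not included.

```python
import math


def karplus(phi_deg, p1=8.0, p2=-1.0, p3=1.0):
    """Equation (2): 3J_HH = P1*cos^2(phi) + P2*cos(phi) + P3.

    phi_deg is the exocyclic H-C-C-H torsion angle in degrees.
    The default P1..P3 are illustrative placeholders (assumption).
    """
    c = math.cos(math.radians(phi_deg))
    return p1 * c * c + p2 * c + p3


# Couplings are large near 0 and 180 degrees and small near 90 degrees.
for phi in (0, 60, 90, 180):
    print(f"phi = {phi:3d} deg -> 3J_HH = {karplus(phi):.2f} Hz")
```

The strong angular dependence is what makes a small set of vicinal couplings informative about the ring torsions.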
$$^3J_{HH} = P_1 \cos^2(\phi) + P_2 \cos(\phi) + P_3 \qquad (2)$$
Altona and Sundaralingam showed that the description of a five-membered ring conformation can be reduced to a two-parameter pseudorotation model [20,21] that fully describes its conformation. The first parameter, the pucker phase P, represents the phase of the conformation and indicates which ring atoms are positioned out of the ring plane. The second parameter, the pucker amplitude $\nu_{max}$, corresponds to the amplitude of the conformation and describes the extent to which the atoms determined by P are out of the plane. The relationship with the endocyclic torsion angles $\nu_{endo,i}$ is shown in (5).
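Equation (5) is not reproduced in this text. The sketch below assumes the form usually attributed to Altona and Sundaralingam, nu_j = nu_max * cos(P + 4*pi*(j - 2)/5) for j = 0..4; the index offset (j - 2) and the function name `endocyclic_torsions` are assumptions made for illustration, as are the example values of P and nu_max.

```python
import math


def endocyclic_torsions(p_deg, nu_max_deg):
    """Return the five endocyclic torsion angles nu_0..nu_4 in degrees.

    Assumed form of equation (5): nu_j = nu_max * cos(P + 4*pi*(j - 2)/5).
    """
    p = math.radians(p_deg)
    return [nu_max_deg * math.cos(p + 4.0 * math.pi * (j - 2) / 5.0)
            for j in range(5)]


# Illustrative values only: a phase of 18 degrees with a 38 degree amplitude.
for j, nu in enumerate(endocyclic_torsions(18.0, 38.0)):
    print(f"nu_{j} = {nu:7.2f} deg")
```

In this form the five torsion angles always sum to zero, and at P = 0 the central angle nu_2 equals nu_max, which offers a quick sanity check on any implementation.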
This well-known pseudorotation description, originally described for the furanose ring in nucleosides and nucleotides [20,21], was further generalized to any five-membered heterocycle by Diez et al. [22-24], who introduced two additional parameters $\alpha_i$ and $\varepsilon_i$ for each endocyclic bond to cope with differences in bond lengths in various types of five-membered rings (Equation 6). As the phase of the conformation P is a periodic variable, polar plots called pseudorotation wheels are mostly used to depict ring conformations.
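Equation (6) is likewise not reproduced in this text. A plausible sketch, assuming that the two per-bond parameters enter as an amplitude scaling $\alpha_i$ and a phase shift $\varepsilon_i$ (both the symbols and the index convention are assumptions):

```latex
\begin{equation}
\nu_i = \alpha_i \, \nu_{max} \cos\!\left(P + \varepsilon_i + \frac{4\pi (i-2)}{5}\right),
\qquad i = 0, \ldots, 4 \tag{6}
\end{equation}
```

With $\alpha_i = 1$ and $\varepsilon_i = 0$ for all bonds, this expression reduces to the two-parameter description of a ring with equal bond lengths.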
Using the above equations, $^3J_{HH}$ couplings can be used to derive the pseudorotation parameters of the five-membered cycle. As mentioned previously, the cycle may be in equilibrium between two conformations. Thus, in the most general case, two sets of pseudorotation parameters (P and $\nu_{max}$) and the relative population ($\%_1$, i.e. the fraction of the first conformation present, with $\%_2 = 1 - \%_1$) need to be fitted to the experimental NMR data. In order to avoid an under-determined model, experimental data measured at different temperatures are generally used. In such cases, the model assumes that only the relative population of the two conformations varies with temperature. Thus n + 4 variables (n being the number of temperatures used) are optimized to fit the experimental data.
To the best of our knowledge, the program PSEUROT [25], originally developed by Altona et al., is still the only generally available program to perform this type of analysis. Written in FORTRAN, its interface as well as its output is purely text-based. In order to facilitate the analysis of the PSEUROT results, a post-processing feature has been included in the independently developed MULDER package [26] to generate a graphical output of the PSEUROT results.
In this communication, we propose an integrated, user-friendly Matlab program, including a self-explanatory graphical user interface (GUI), to facilitate the set-up, execution and subsequent analysis of pseudorotation calculations for five-membered ring systems decorated with a variety of substituents. The use of Matlab as a high-level programming language makes it possible to create, within a limited time frame, high-quality plots that provide a graphical impression of the accessible conformational space. Furthermore, owing to the open-source GNU GPL license, users have the opportunity to adapt the program to their specific needs.
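The two-state model and its variable count can be illustrated with a short sketch. It assumes that the observed coupling is the population-weighted average of the couplings of the two conformations, which is the usual treatment for fast exchange on the NMR time scale; the function names and numerical values are hypothetical.

```python
def averaged_coupling(j_conf1, j_conf2, frac1):
    """Population-weighted coupling (Hz) for two rapidly exchanging conformations.

    frac1 is the fraction of the first conformation; the second is 1 - frac1.
    """
    if not 0.0 <= frac1 <= 1.0:
        raise ValueError("frac1 must lie in [0, 1]")
    return frac1 * j_conf1 + (1.0 - frac1) * j_conf2


def n_fit_variables(n_temperatures):
    """Two phases P, two amplitudes nu_max, plus one population per temperature."""
    return 4 + n_temperatures


# Hypothetical example: 25 % of a conformer with J = 2 Hz, 75 % with J = 8 Hz.
print(averaged_coupling(2.0, 8.0, 0.25))
# Five temperatures give 9 variables to be determined from the measured couplings.
print(n_fit_variables(5))
```

Comparing the number of variables with the number of measured couplings gives a first indication of whether the model is under-determined.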

Implementation
The program consists of a computational core that is accessed and controlled through a GUI (Figure 1), both written in Matlab. Its goal is to find the pseudorotation parameters {P, $\nu_{max}$} for at most two conformations, as well as their relative population ($\%_{1,n}$) at n temperatures, that best fit a series of experimental NMR scalar coupling constants. The initial pseudorotation parameters are set by the user at the start of the computational procedure. The user is also given the choice to define a subset of pseudorotation parameters that have to be optimized. Using equations 1, 4 and 6, the scalar coupling constants corresponding to the conformation defined by the initial pseudorotation values are calculated and a root-mean-square deviation (RMSD) is determined with respect to the experimental coupling data. Next, the pseudorotation parameters are adapted so as to minimize this RMSD. The Matlab fmincon function from the Optimization Toolbox, which is based on a Sequential Quadratic Programming (SQP) algorithm [27], was used for this purpose. To restrict the optimization to physically sensible solutions, values of partition coefficients were restrained to the [0, 1] interval. Furthermore, puckering amplitudes ($\nu_{max}$) were restricted to the [...]
A choice is provided between two operational modes. In the first mode, an optimization of the chosen parameters is performed as is, yielding the pseudorotation parameters that best fit the experimental data. The output generated in this operation mode is purely text-based and contains the optimized variables, their corresponding endocyclic torsion angles and a tabular comparison between experimental and fitted scalar coupling data (see the accompanying manual).
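The RMSD objective described above can be sketched as follows. This is not the program's Matlab code: it is a simplified Python illustration that assumes a linear form for equation 1, an assumed cosine form for the pseudorotation relation, and a plain Karplus relation with placeholder parameters in place of equation 4. The coupling definitions (ring bond index, A, B) and all numerical values are hypothetical, and the clipping of the populations is only a crude stand-in for the bound constraints handled by fmincon.

```python
import math

KARPLUS = (8.0, -1.0, 1.0)  # placeholder P1, P2, P3 (assumption, not equation 4)


def coupling(p_deg, nu_max, bond, a, b):
    """Pseudorotation parameters -> endocyclic -> exocyclic torsion -> 3J_HH (Hz)."""
    nu = nu_max * math.cos(math.radians(p_deg) + 4.0 * math.pi * (bond - 2) / 5.0)
    c = math.cos(math.radians(a * nu + b))  # assumed linear form of equation 1
    p1, p2, p3 = KARPLUS
    return p1 * c * c + p2 * c + p3


def two_state(params, t, bond, a, b):
    """params = [P1, nu1, P2, nu2, x_T1, ..., x_Tn]; t is the temperature index."""
    p1, nu1, p2, nu2 = params[:4]
    x = min(1.0, max(0.0, params[4 + t]))  # keep the population inside [0, 1]
    return x * coupling(p1, nu1, bond, a, b) + (1.0 - x) * coupling(p2, nu2, bond, a, b)


def rmsd(params, couplings, observed):
    """Root-mean-square deviation between calculated and observed couplings."""
    sq = [(two_state(params, t, *c) - j_exp) ** 2
          for t, row in enumerate(observed)
          for c, j_exp in zip(couplings, row)]
    return math.sqrt(sum(sq) / len(sq))


# Hypothetical couplings (ring bond index, A, B) and synthetic data at 2 temperatures.
COUPLINGS = [(1, 1.0, 0.0), (1, 1.0, 120.0), (2, 1.0, 0.0)]
TRUE = [18.0, 38.0, 162.0, 36.0, 0.3, 0.5]
OBSERVED = [[two_state(TRUE, t, *c) for c in COUPLINGS] for t in range(2)]

print(rmsd(TRUE, COUPLINGS, OBSERVED))                 # generating parameters
print(rmsd([48.0] + TRUE[1:], COUPLINGS, OBSERVED))    # shifted phase of conformer 1
```

Any bounded minimizer can be applied to `rmsd`; the program described here uses fmincon for that step.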
In the second mode of the program, the complete pseudorotation space of the cycle of interest is explored via 3600 combinations of pseudorotation parameters [...] Whether the parameters of a second conformation and the relative populations at each temperature are optimized during each of the 3600 runs is again determined by the user's choice. This results in two possible outputs for this operating mode. When the user chooses not to optimize any parameters of the second conformation, one can easily assess from the pseudorotation wheel whether the experimental data can be fitted by a single five-membered ring conformation. When the second conformation's parameters (and the relative population of both conformations) are also optimized, the pseudorotation wheel represents the conformational space accessible to the cycle when two five-membered ring conformations are fitted to the experimental data. In this latter case, one additional pseudorotation wheel is created for each temperature, depicting the relative population of the two fitted conformations for each set of {P, $\nu_{max}$} using contour levels.
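The scan mode can be sketched for the simplest case, in which no second conformation is optimized. The grid used here (72 phase values in 5 degree steps times 50 amplitudes of 1 to 50 degrees, giving 3600 points) is an assumption chosen only to match the stated number of combinations; the actual grid of the program is not given in this text. The Karplus parameters, the coupling definitions and the synthetic data are hypothetical as well.

```python
import math


def coupling(p_deg, nu_max, bond, a, b, karplus=(8.0, -1.0, 1.0)):
    """3J_HH (Hz) for one coupling; model forms and parameters are placeholders."""
    nu = nu_max * math.cos(math.radians(p_deg) + 4.0 * math.pi * (bond - 2) / 5.0)
    c = math.cos(math.radians(a * nu + b))
    return karplus[0] * c * c + karplus[1] * c + karplus[2]


def scan(couplings, observed, p_step=5.0, nu_values=range(1, 51)):
    """Return {(P, nu_max): RMSD} for a single-conformation model on a grid."""
    grid = {}
    for i in range(int(round(360.0 / p_step))):
        p = i * p_step
        for nu_max in nu_values:
            sq = sum((coupling(p, nu_max, *c) - j) ** 2
                     for c, j in zip(couplings, observed))
            grid[(p, float(nu_max))] = math.sqrt(sq / len(observed))
    return grid


# Hypothetical couplings (ring bond index, A, B) and synthetic single-conformer data.
COUPLINGS = [(1, 1.0, 0.0), (1, 1.0, 120.0), (2, 1.0, 0.0), (2, 1.0, -120.0)]
OBSERVED = [coupling(160.0, 35.0, *c) for c in COUPLINGS]

GRID = scan(COUPLINGS, OBSERVED)
best = min(GRID, key=GRID.get)
print(len(GRID), best, GRID[best])
```

The resulting dictionary maps each grid point to an RMSD value and is the kind of data that would be drawn as contour levels on a pseudorotation wheel.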
This second mode of calculation has some interesting advantages over the first mode. First, a better appreciation is obtained of the conformational space of the cycle that is in agreement with the experimental data. Furthermore, the extent of the contour levels in the pseudorotation plot allows one to establish whether or not the model is under-determined. Even when the model is under-determined, such a calculation can already indicate those conformations that can be excluded from further investigation.

Program features
Only the main features of the program, fully described in the manual available together with the program, are discussed. The GUI is easy to use and handles the input as well as the various operational modes and data output (Figure 1). Two modes of calculation for fitting five-membered ring conformations to $^3J_{HH}$ scalar coupling data obtained at up to five temperatures can be selected. Both the coupling data and additional parameters required by equations 1 and 6 are defined interactively, using a simplified representation of the cycle (Figure 1, upper panel). Both the atom types in the cycle and the substituents can be defined interactively. The GUI includes an 'electronegativity editor' (Figure 1, bottom panel) which provides a convenient graphical aid to set up the group electronegativities ($\lambda_i$) for many common substituents as required by equation 4.
Also, the program is tolerant of sparse input data. When only a sum of two or more $^3J_{HH}$ scalar couplings can be measured, e.g. due to overlap, this sum can be included in the model instead of the individual couplings. In contrast to the PSEUROT program, a coupling does not necessarily have to be measured at each temperature to be included in the calculation.
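One way to accommodate sums of couplings and missing temperatures is to store each measurement as an independent observation. The sketch below is a hypothetical data layout, not the program's internal representation: each observation names a temperature index, the indices of the couplings that contribute to the measured value, and the value itself.

```python
import math


def rmsd_sparse(calc, observations):
    """RMSD over an arbitrary list of observations.

    calc[t][k] is the calculated coupling k (Hz) at temperature index t.
    Each observation is (t, indices, value); indices holds more than one
    entry when only the sum of overlapping couplings could be measured.
    Couplings not measured at a given temperature are simply left out.
    """
    sq = 0.0
    for t, indices, value in observations:
        sq += (sum(calc[t][k] for k in indices) - value) ** 2
    return math.sqrt(sq / len(observations))


# Hypothetical calculated couplings at two temperatures.
CALC = [[2.0, 7.5, 6.0],
        [2.4, 7.1, 5.8]]

# Coupling 0 is measured at both temperatures; couplings 1 and 2 overlap and
# are only available as a sum at the first temperature.
OBSERVATIONS = [(0, (0,), 2.0), (0, (1, 2), 13.5), (1, (0,), 2.4)]

print(rmsd_sparse(CALC, OBSERVATIONS))
```

Because the residual is formed per observation, the number of data points per temperature does not need to be the same.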
Four different parameterizations of equation 4 are implemented in the program. These include the 20-parameter and 12-parameter parameterizations by Donders [14] and two slightly different 9-parameter parameterizations [28,29], one of which is used in PSEUROT 6.2. The use of Diez parameters in the calculations is fully implemented by using equation 6 for the calculation of exocyclic torsion angles from the puckering parameters. The current version does not include the Barfield correction [30,31] used in some studies.
Depending on the calculation mode selected, textual or graphical output is provided. As the program is written in Matlab, the plots can be easily post-processed to comply with the user's needs. In addition, several parameters involving the construction of the plots can be set manually. These parameters include the resolution of the search grid used for the pseudorotation scan (i.e. the increment in P and $\nu_{max}$) and the minimal and maximal contour levels used in the plots. Furthermore, interpolation of these contour plots using cubic spline functions can be selected for obtaining smoothed graphs.
Standard data sets for β-D-ribose and β-D-deoxyribose as well as input data for compounds 1 and 2 are available with the distribution of the software.

Comparison to PSEUROT
To test the performance of the computational core and the input/output handling using the GUI, and to compare the results with those of PSEUROT (version 6.2), a set of two 4'-thio-2'-deoxynucleoside analogs [32] was used. For both molecules, depicted in Figure 2, the literature [32] provides a full set of five coupling constants measured at five different temperatures as well as Diez parameters for each endocyclic bond. For both compounds, two conformations were fitted to the experimental data using our Matlab GUI and PSEUROT 6.2. The most recent parameterization of equation 4 was used. Details of these optimal fits, as well as a third fit, taken from the literature [32] and referred to as LHK, using the Haasnoot equation 3 [12], are presented in Table 1. This table includes [...]
In order to get a better appreciation of the precision of these optimal values, the full pseudorotation wheel of 1 and 2 has been scanned by the Matlab program, resulting in Figures 3 and 4. In Figure 3, contour plots indicate the root-mean-square deviation between the fitted and the experimental scalar couplings for 1 (left) and 2 (right). Looking at the extent of the contour lines, it is clear that the South conformation is better defined than the North conformation. Taking into account errors originating from measuring scalar couplings, deficiencies of the model and the Karplus equation, the optimal fit does not necessarily correspond to the real situation. Therefore, interpreting the regions that have an RMSD lower than a certain threshold (e.g. 0.2 Hz) as representative of the cycle's conformation seems more reasonable than taking the optimal fit for granted, especially when comparing different compounds. In addition to Figure 3, Figure 4 can [...]
[Figure caption, truncated: Pseudorotation wheel for compounds 1 and ...]
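The idea of reading a low-RMSD region rather than a single optimum can be sketched as a simple filter over scan results. The grid values below are invented toy numbers, not results for compounds 1 or 2, and the function name `accessible_region` is hypothetical; only the 0.2 Hz example threshold comes from the text.

```python
def accessible_region(grid, threshold=0.2):
    """Return the (P, nu_max) grid points whose RMSD (Hz) is below the threshold."""
    return sorted(point for point, value in grid.items() if value < threshold)


# Toy scan results: {(P in degrees, nu_max in degrees): RMSD in Hz}.
GRID = {
    (150.0, 35.0): 0.31,
    (155.0, 35.0): 0.18,
    (160.0, 35.0): 0.05,
    (165.0, 35.0): 0.12,
    (170.0, 35.0): 0.27,
}

region = accessible_region(GRID)
print(region)
print("best fit:", min(GRID, key=GRID.get))
```

Since P is periodic, a region that straddles 0 and 360 degrees should be reported as one contiguous range rather than by its minimum and maximum phase values.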

Conclusion
A Matlab program with an easy-to-use graphical user interface for the calculation of five-membered ring conformations has been presented. This program has been made freely available online under a GNU GPL license. The performance of the program was tested on a set of two 4'-thio-2'-deoxynucleoside analogs. It has been shown that results identical to those of PSEUROT can be obtained. Furthermore, high-quality graphical output can be generated, facilitating the interpretation of the calculations.