Solid-phase molecular recognition of cytosine based on proton-transfer reaction. Part II. Supramolecular architecture in the cocrystals of cytosine and its 5-fluoroderivative with 5-nitrouracil

Background Cytosine is a biologically important compound owing to its natural occurrence as a component of nucleic acids. It plays a crucial role in DNA/RNA base pairing through several hydrogen-bonding patterns, and controls essential features of life, as it is involved in the genetic codons of 17 amino acids. The molecular recognition among cytosines, and the molecular heterosynthons of molecular salts formed through proton-transfer reactions, might be used to investigate the theoretical sites of cytosine-specific DNA-binding proteins and to design molecular imprints.
Results Reaction of cytosine (Cyt) and 5-fluorocytosine (5Fcyt) with 5-nitrouracil (Nit) in aqueous solution yielded two new products, which have been characterized by single-crystal X-ray diffraction. The products are a dihydrated molecular salt (CytNit) containing both ionic and neutral hydrogen-bonded species, and a dihydrated cocrystal of neutral species (5FcytNit). In CytNit a protonated and an unprotonated cytosine form a triply hydrogen-bonded aggregate in a self-recognition ion-pair complex, and this dimer is in turn hydrogen bonded to one neutral and one anionic 5-nitrouracil molecule. In 5FcytNit the two neutral nucleobase derivatives are hydrogen bonded in pairs. In both structures conventional N-H...O, O-H...O, N-H+...N and N-H...N- intermolecular interactions are the most significant in the structural assembly.
Conclusion The supramolecular structures of the molecular adducts formed by cytosine and 5-fluorocytosine with 5-nitrouracil, CytNit and 5FcytNit, respectively, have been investigated in detail. CytNit and 5FcytNit exhibit widely differing hydrogen-bonding patterns, though both possess layered structures. The crystal structures of CytNit (ΔpKa = -0.7, molecular salt) and 5FcytNit (ΔpKa = -2.0, cocrystal) confirm that, at the present level of knowledge about the nature of the proton-transfer process, there is no strict correlation between the ΔpKa values and proton transfer, in that acid/base pKa strength is not a definite guide for predicting the location of H atoms in the solid state. Finally, the absence in 5FcytNit of hydrogen bonds involving fluorine is in agreement with findings that covalently bound fluorine hardly ever acts as an acceptor for available Brønsted acidic sites in the presence of competing heteroatom acceptors.

number of competing weak forces. Therefore, there is great interest in the interaction of both small molecules and proteins with DNA, as well as in the function of the resulting complexes.
Among several non-covalent binding interactions (i.e. hydrogen bonding, ionic interactions, van der Waals and π-π stacking), hydrogen bonding is very commonly used by chemists for the de novo design of self-assembled or self-associated compounds, because of its strength and directional properties [1,2]. This is especially true for biological structures, and in the last decade considerable effort in this laboratory has been devoted to designing assemblies of nucleic acid bases with aromatic N-heterocycles in the solid state that mimic, via multiple hydrogen bonds, the base pairing of nucleic acids. These investigations have led to a number of structural studies [3-15].
As hydrogen bonds may be considered the partially activated precursors to proton-transfer reactions [16], whenever a hydrogen-bonding association results in complete proton transfer an ionic compound is produced, and the non-covalent interactions between the hydrogen-bonding groups are reinforced. The relevance of proton transfer in DNA/RNA systems was raised many years ago. A few years after Watson and Crick's suggestion that the genetic code may be perturbed by the formation of nucleic acid bases (NABs) in so-called rare (not canonical keto-amine) tautomers [17], Löwdin, in a pioneering work, introduced the hypothesis that rare tautomeric forms could be produced in pairs by intermolecular single or double proton-transfer (SPT/DPT) reactions within the hydrogen bonds connecting a DNA base pair [18]. If, during the replication of DNA, combinations other than the normal pairings of complementary NABs are possible, the normal hydrogen-bonding pattern in DNA is altered and the sequence of bases in the resulting DNA is different, leading to spontaneous mutations. Many theoretical studies have been devoted to testing Löwdin's hypotheses [19,20]. At present, for neutral systems all studies agree that the SPT reaction is less favorable than the DPT one, as the single-transfer process implies a charge separation when forming the ion-pair complex, while in the DPT process electroneutrality is retained. Nevertheless, the energy barrier is high, and the double tautomer is thermodynamically unstable. Thus, the DPT reaction is not expected to have mutagenic effects. In contrast, for protonated base pairs the SPT products are largely stabilized, since the SPT reaction does not imply the creation of an ion pair but just the transfer of a positive charge. Products arising from such processes are stable and can be involved in mutagenic phenomena.
Protonation of NABs also contributes to the stabilization of unusual DNA structures such as the triple helix, which is greatly stabilized at acidic pH, and knowledge of the site of proton attachment is essential for the design of new intercalating drugs that stabilize the triple helix [21].
Among the four DNA bases, cytosine (Cyt) has been the focus of much research along these lines, as it plays a crucial role in DNA/RNA base pairing through several hydrogen-bonding patterns, and controls essential features of life, as it is involved in the genetic codons of 17 amino acids. Moreover, protonation at N3 of the cytosine ring (according to the numbering scheme given in Figure 1) is a necessary step in homo-base-pair association. A well-known example is the i-motif, based on the formation of a quadruplex structure involving Cyt-CytH+ reversed mismatch pairs for polyCyt at acidic pH [22]. In this respect, the supramolecular structure of cytosine coupled with acidic uracil derivatives can be regarded as a solid-phase model of molecular recognition based on proton-transfer reactions.
A possible guide for predicting the formation of neutral or charged components in hydrogen-bonded molecular adducts formed through the transfer of a proton is ΔpKa [pKa(conjugate acid of the base) - pKa(acid), with pKa values for aqueous solution at 25°C] [23]. It is generally accepted that for large ΔpKa (i.e. greater than 3) salts of the type B+-H...A− are formed. Smaller ΔpKa (less than 0) will almost exclusively result in neutral-component B...H-A compounds (cocrystals), but the parameter seems inappropriate for accurately predicting salt or cocrystal formation in the solid state when ΔpKa is between 0 and 3 [24,25]. The proton-transfer process can be favored through the use of stronger Brønsted acids and/or bases, and indeed cytosine (pKa1 = 4.6 and pKa2 = 12.2 [26]) is readily protonated at the N3 position in the presence of strong acids. Even though this molecule is particularly amenable to the formation of molecular complexes through proton-transfer reactions, the first example of solid-state molecular recognition of cytosine by acidic nucleobase derivatives appeared only recently [27].
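The ΔpKa rule of thumb described above can be written as a short calculation. The following Python sketch is only an illustration: the function names are hypothetical, and the pKa of 5-nitrouracil (5.3) is an assumption, back-calculated from the ΔpKa of -0.7 reported for CytNit in this paper together with the pKa1 of cytosine (4.6).

```python
def delta_pka(pka_conj_acid_of_base: float, pka_acid: float) -> float:
    """ΔpKa = pKa(conjugate acid of the base) - pKa(acid), rounded to 0.1."""
    return round(pka_conj_acid_of_base - pka_acid, 1)


def rule_of_thumb(dpka: float) -> str:
    """Outcome suggested by the empirical ΔpKa thresholds quoted in the text."""
    if dpka > 3.0:
        return "salt expected"
    if dpka < 0.0:
        return "cocrystal expected"
    return "not predictable (0 <= ΔpKa <= 3)"


# pKa1 of cytosine is quoted in the text; the acid pKa is an ASSUMPTION,
# back-calculated from the ΔpKa of -0.7 reported for CytNit (4.6 + 0.7).
PKA_CYTOSINE = 4.6
PKA_NITROURACIL_ASSUMED = 5.3

dpka_cytnit = delta_pka(PKA_CYTOSINE, PKA_NITROURACIL_ASSUMED)
print(dpka_cytnit, rule_of_thumb(dpka_cytnit))
```

For the cytosine/5-nitrouracil pair the rule of thumb points to a cocrystal, so the thresholds should be read as a rough empirical guide rather than a prediction of where the H atom will be found in the solid state.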
Replacement of a hydrogen or hydroxyl group by fluorine in a bioactive compound often imparts, or improves, desirable biochemical and/or pharmacological properties (e.g. 5-fluorouracil). Fluorination is commonly regarded as an isosteric monovalent substitution, since the van der Waals radii are 1.20 Å for H, 1.40 Å for OH and 1.47 Å for F [28]. Thus, a monofluorinated analogue is geometrically very similar to its parent molecule and hence meets the steric requirements at enzyme receptor sites [29-33]. The effect of fluorine as a substituent in biomolecules can be attributed to its strong electron-withdrawing properties (and to its electron-pair-donating mesomeric effect in conjugated systems). It should be noted that the ability of C-F groups to act as weak hydrogen-bond acceptors (1-3 vs 5-10 kcal/mol for oxygen as an acceptor) has turned out to be the most discussed (and controversial) issue for organic fluorine in the literature [34-38]. Since, as noted above, hydrogen bonds are indispensable features of higher-order DNA/RNA structures, this much-debated aspect increases the value of fluoro-modified nucleobases in molecular recognition.

Results and discussion
The asymmetric unit of molecular salt (I) is shown in Figure 1 and consists of one protonated (CytH+) and one neutral (Cyt) cytosine aminooxo tautomer, coplanar with one neutral (Nit) and one anionic (Nit−) diketo tautomer of 5-nitrouracil, linked by multiple hydrogen bonds in a plane along with two water molecules of crystallization. Complete deprotonation occurs as a result of the proton-transfer process from N11, the more acidic of the two sites available for ionization in the heterocyclic ring of Nit [13], to the N3 atom of the pyrimidine ring of Cyt. The H atom at the N3 position in CytNit was located in difference Fourier maps and is probably not entirely located at the nitrogen site. The unusual displacement parameter of H3, 0.12(2) Å², prompted us to investigate a model in which the hydrogen atom is disordered over two positions in the central N3-H3...N3a hydrogen bond. Attempts in the current work to quantify the hydrogen-atom disorder directly by refining the hydrogen-atom site occupancy factors (SOFs) against the X-ray diffraction data proved to be somewhat problematic. Indeed, bonding effects and the correlation of SOFs with thermal parameters make the resulting hydrogen-atom occupancies less reliable. A refinement strategy was adopted that fixed the isotropic thermal factors (ITFs) of the disordered hydrogen-atom sites to be equal to the average of the other hydrogen-atom ITFs. This model produced unstable refinements. A neutron diffraction study would be needed to make any further observations about the behavior of this hydrogen atom.
Figure 1 The asymmetric unit of CytNit, showing the atom-labeling scheme and hydrogen bonding (double dashed lines). Displacement ellipsoids are drawn at the 50% probability level and H atoms as small spheres of arbitrary size.
The N3 protonation or its absence is reflected in the C2-N3-C4 bond angle. The N3 protonation in CytH+ is consistent with the larger C2-N3-C4 bond angle, 122.5(3)°, while for unprotonated Cyt the angle is 121.1(3)°. Nevertheless, the latter value, when compared with the corresponding one reported for cytosine, 119.4(2)° [40], could again suggest that H3 is partly shared in the structure. The prevailing protonation site is further corroborated by a general comparison of the molecular geometry of the base ring of CytH+ in the molecular adduct (Table 1) with that observed in a number of structures containing protonated cytosine [27]. Minor exceptions can be attributed to the different hydrogen-bonding configurations. Concerning the molecular dimensions of the 5-nitrouracilate anion, the bond lengths and bond angles of the heteroaromatic ring are in accord with the values obtained for Nit− in the (1:1) benzamidinium 5-nitrouracilate adduct [14]. The two uracil derivatives are coplanar, as in Nit and in Nit− the nitro groups form dihedral angles of 1.4(1)° and 1.1(1)°, respectively, with the mean planes of the pyrimidine rings.
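The comparison of C2-N3-C4 angles can be made more concrete by expressing each difference in units of its combined standard uncertainty. The Python sketch below is an illustrative check, not part of the reported analysis: it assumes that the quoted esds are independent, and the function name is hypothetical.

```python
import math


def difference_in_esds(a: float, esd_a: float, b: float, esd_b: float) -> float:
    """Return |a - b| divided by the combined esd, sqrt(esd_a^2 + esd_b^2).

    Assumes the two standard uncertainties are independent.
    """
    return abs(a - b) / math.hypot(esd_a, esd_b)


# C2-N3-C4 angles (degrees) with esds, as quoted in the text.
cyth_vs_cyt = difference_in_esds(122.5, 0.3, 121.1, 0.3)  # CytH+ vs Cyt
cyt_vs_free = difference_in_esds(121.1, 0.3, 119.4, 0.2)  # Cyt vs cytosine [40]

print(f"CytH+ vs Cyt: {cyth_vs_cyt:.1f} combined esds")
print(f"Cyt vs free cytosine: {cyt_vs_free:.1f} combined esds")
```

Under this assumption the unprotonated Cyt angle lies further from the literature value for cytosine (about 4.7 combined esds) than from the CytH+ angle (about 3.3), which is in line with the suggestion that H3 is partly shared.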
In the supramolecular structure of molecular salt (I), the hydrogen-bonding scheme is rather complex and is characterized by sixteen unique two- and three-center intermolecular hydrogen bonds, namely ten N-H... [...] (Table 2). For descriptive purposes, it is convenient to select a 'superadduct' consisting of one asymmetric unit, and to analyze first the hydrogen bonding within this aggregate and then the hydrogen-bonding patterns between neighboring superadducts (Figure 3).
As previously mentioned, in the crystal structure of CytNit each asymmetric unit comprises four molecules [...] 3(10) and R⁴₄(10) motifs, connect the one-dimensional polymeric chains, thereby generating a two-dimensional supramolecular hydrogen-bonded network parallel to the bc plane. The formation of this two-dimensional array is facilitated by water molecules, which act as bridges between superadducts.
The asymmetric unit of compound (II) is shown in Figure 4 and consists of a (1:1) doubly hydrogen-bonded pair of the canonical aminooxo and diketo tautomers of 5-fluorocytosine and 5-nitrouracil, respectively, assembled with two water molecules of crystallization to form a dihydrated cocrystal. Indeed, the two nucleobases are in the neutral form because fluorine, being the most electronegative atom, significantly reduces the basicity of nearby basic groups in the 5-fluorocytosine molecule (pKa1 = 3.3 and pKa2 = 10.7).
In 5FcytNit the two nucleobases are essentially coplanar, as in Nit the nitro group forms a dihedral angle of 4.9(4)° with the mean plane of the pyrimidine ring. The molecular geometry of the two components of the cocrystal (Table 1) largely agrees with the known solvent-free structures of the two units [42,43].
The crystal structure of compound (II) is shown in Figure 5. The 5-fluorocytosine and 5-nitrouracil molecules are associated in the crystal by extensive hydrogen bonding into a three-dimensional network (Table 2).

Conclusion
The supramolecular structures of the molecular adducts formed by cytosine and 5-fluorocytosine with 5-nitrouracil have been investigated in detail. CytNit and 5FcytNit exhibit widely differing hydrogen-bonding patterns, though both possess layered structures. The crystal structures of CytNit (ΔpKa = -0.7, molecular salt) and 5FcytNit (ΔpKa = -2.0, cocrystal) confirm that, at the present level of knowledge about the nature of the proton-transfer process, there is no strict correlation between the ΔpKa values and proton transfer, in that acid/base pKa strength is not a definite guide for predicting the location of H atoms in the solid state. Finally, the absence in 5FcytNit of hydrogen bonds involving fluorine is in agreement with findings that covalently bound fluorine hardly ever acts as an acceptor for available Brønsted acidic sites in the presence of competing heteroatom acceptors.

Experimental
All materials (Aldrich Chemical Company, 99%) were used as received without further purification. Cytosine and 5-fluorocytosine (1 mmol each) were dissolved in hot water (15 ml per solution), and each solution was added to a hot aqueous solution (20 ml) of 5-nitrouracil in equimolar ratio. After concentration to ca 30 ml, the resulting solutions were stirred at 50°C for 24 hours under reflux. After two weeks, small transparent single crystals were obtained by slow room-temperature evaporation of the two solutions and were then used for the X-ray diffraction experiments.
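For orientation, the 1 mmol quantities used above can be converted to masses. The Python sketch below assumes the standard molecular formulas of the three compounds and rounded standard atomic weights, none of which are stated in the text; the helper names are hypothetical, and no correction is made for the 99% purity.

```python
# Rounded standard atomic weights in g/mol (assumed, not from the text).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "F": 18.998}

# Standard molecular formulas (assumed, not from the text).
FORMULAS = {
    "cytosine": {"C": 4, "H": 5, "N": 3, "O": 1},
    "5-fluorocytosine": {"C": 4, "H": 4, "F": 1, "N": 3, "O": 1},
    "5-nitrouracil": {"C": 4, "H": 3, "N": 3, "O": 4},
}


def molar_mass(formula: dict) -> float:
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in formula.items())


def mass_mg(formula: dict, amount_mmol: float) -> float:
    """Mass in mg for a given amount in mmol (g/mol * mmol = mg)."""
    return molar_mass(formula) * amount_mmol


for name, formula in FORMULAS.items():
    print(f"{name}: {mass_mg(formula, 1.0):.1f} mg per 1 mmol")
```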

X-ray Crystallography
The intensity data were collected at 298(2) K on an Oxford Diffraction Xcalibur S CCD diffractometer with graphite-monochromated Mo Kα radiation (λ = 0.71069 Å), operated at 50 kV and 40 mA. The data reductions were performed using the CrysAlis software package [44]. Solution, refinement and analysis of the structures were done using the programs integrated in the WinGX system [45]. The structures were solved by direct methods (SIR2002) [46] and refined by the full-matrix least-squares method based on F² (SHELXL-97) [47]. The non-hydrogen atoms were refined anisotropically until convergence was reached. All the hydrogen atoms were located in a difference Fourier map and refined isotropically, with the exception of those linked to the ring C atoms, which were positioned with idealized geometry and refined as riding on their parent atoms [C-H = 0.93 Å]; their Uiso(H) values were set to 1.2 times the Ueq value of the carrier atom. At this stage the difference Fourier maps for both molecular adducts showed values not exceeding 0.25(4) e Å⁻³, which are not of chemical significance. Geometrical calculations were performed using PLATON [48]. The figures were prepared using ORTEP-3 [49]. The final crystallographic data collection and refinement details are summarized in Table 3. CCDC reference numbers: 827539 and 827540. All CIF information can be found in Additional file 1.