Rings or daggers, axes or fibulae have a different composition? A multivariate study on Central Italy bronzes from eneolithic to early iron age

Background One of the main concerns for archaeo-metallurgists and archaeologists is to determine to what extent ancient craftsmen understood the effect of metal alloy composition and were able to control it in order to produce objects with the most suitable features. This problem can be investigated by combining compositional analyses of a high number of ancient artefacts with correlation analyses of the objects’ age, production site, destination of usage etc. – and thus chemometric data treatment is carried out. In this study, multivariate analyses were performed on a matrix composed of elemental compositional data from 134 archaeological bronze objects, obtained by XRF analyses. Analysed objects have been dated back from the Eneolithic Period to the end of the Bronze Age including the early Iron Age and were excavated in Central Italy (mainly Abruzzo Region). Results Chemometric analysis was performed attempting to visualise clouds of objects through PCA. In parallel and independently, object grouping was attempted using several different approaches, based on object characteristics (e.g. shape, weight, type of use – cutting or hitting and age) following indications given by archaeologists (or derived from the archaeological context). Furthermore, case-tailored data pretreatment (logratio-centred scaling) was used, but no homogeneous groups could be identified. Conclusions By using chemometric data analysis, homogeneous groups of objects could not be detected, meaning that compositional data of alloys is not correlated with the considered objects’ characteristics. This favours the conclusion that – without discussing the ascertained ability of ancient foundry-men - they had also already discovered the convenience of recycling broken objects thus producing a more or less similar bronze alloy each time, depending on materials’ availability; necessary mechanical characteristics could then be obtained by post processing. 
Graphical Abstract Scores PCA plot in 3D space with 3 different bronze objects. Electronic supplementary material The online version of this article (doi:10.1186/s13065-015-0090-7) contains supplementary material, which is available to authorized users.


Background
A huge number of small bronze fragments can be found in Italy. One of the oldest finds, an axe of pure copper used by Ötzi the Iceman, was found in 1991 in the Ötztal Alps, at the boundary between Italy and Austria [1]; according to archaeologists it dates back to about 3300 b.C. and marks the passage between the Copper and Bronze Ages [2]. Items of copper and its alloys are very common metallic cultural heritage objects.
The use of native copper for the production of small pieces of jewelry, weapons and tools by hammering and partial melting goes back to 7000 b.C.
Later, as is still done today, copper objects were also obtained by smelting ores [3]. In this case, objects generally show characteristic Cu(I) oxide inclusions, formed during the smelting process. The presence of other elements (arsenic, iron, zinc, lead, nickel, antimony, silver) leads us to believe that the production of alloys was probably accidental in the beginning and was only later used by metalworkers to influence the colour and other properties (e.g. workability, hardness, etc.) [4] of the metal. Further knowledge of this ability of ancient metalworkers is a main concern for metallurgists and archaeologists, but the distinction between impure copper artefacts and early bronze ones is not always immediately obvious. In fact, the first copper alloy, copper-arsenic, is thought to have an accidental origin because arsenic is often associated with copper sulphide minerals.
Therefore these alloys probably result from smelting copper ores containing arsenic or mixed copper-arsenic minerals such as olivenite or clinoclase [5]. A golden colour characterises these alloys. Bronze, as an intentional alloy of copper and tin, began to be made between 4000 and 3000 b.C., probably with the intent to overcome the softness of "pure" copper (only marginally useful for the production of tools). The presence of tin increases both the alloy's castability and the product's hardness. When the tin content goes beyond 20% the alloy takes on a silver-white and glossy appearance (mirrors were made by the Romans with an alloy called "speculum" holding a tin content of 19-33% and by the Chinese with a tin content ranging from 24 to 50%) [6][7][8][9]. The practice of adding limited amounts of lead (generally less than 2%) to bronze to increase workability and fluidity began around 1000 b.C. and was fully developed by the Romans. As lead is not soluble in copper and segregates in the alloy, it can be recognised in a metallographic section as small dark spots. In what is called lead-bronze the lead content may be as high as 10%, with the intent of improving the alloy's softness and thus its castability (when fine details are of interest).
When the opposite result is desired, as for the production of cutting tools, which have to have a cutting-edge, antimony or arsenic can be added to bronze producing antimonial or arsenical bronze.
During the Empire of Augustus, the Romans began to add limited amounts of zinc to copper in order to improve castability and obtain a bronze-like alloy called brass [10].
Chinese smelters added zinc to bronze to improve workability and to obtain a whitish appearance. The presence of nickel (above 1%) was detected in bronze objects produced by the Sumerian and Syrian civilisations during 4000-3500 b.C., and in China nickel was added to obtain a silverish appearance. Antimony, again, was found mainly in copper-base objects produced in tin-poor regions, like the Caucasus [11].
The use of copper was, and is, probably so common and widespread not only due to its characteristic colour or its easy workability, but also thanks to its durability: in former times an object's lifetime was of high importance, and the lifetime of copper and copper alloy objects is long when the right maintenance treatments are observed.
However, as a metal it is subject to oxidation phenomena, and archaeological objects are always covered by more or less thick corrosion layers, depending on intrinsic (alloy composition and structural features) and extrinsic (e.g. soil characteristics) variables [12][13][14]. So, when the composition of an archaeological object is studied today, regardless of the method of analysis used, it must be taken into consideration that the measure obtained is not the exact representation of the original alloy composition [14]. Even if only the remaining metal bulk is analysed, its composition may have changed due to preferential corrosion of certain alloying elements and to leaching phenomena. Bearing this in mind, compositional studies can hardly aim to determine the exact concentration of an element [15] in the original ancient alloy; rather, they aim to detect intentional differences created by ancient metalworkers, in order to answer the questions of archaeometallurgists and archaeologists.
The aim of the present study was a new attempt [16] to detect a correlation between composition, age, type of object and its destination of usage; this time on a sample of 134 archaeological bronze objects excavated in Central Italy (mainly Abruzzo Region), see Figure 1, dating back from the Eneolithic Period to the end of Bronze Age including the early Iron Age. In Figure 2 a photo of a measured object is shown.
As in the previous work [16], compositional data was provided by micro-invasive ED-XRF analysis, but here object types were easily recognisable and thus so was their destination of usage; Table 1 summarizes data on the objects studied: usage (according to shape) and the number of measurements. Compared to other laboratory techniques for elemental analysis like AAS, IC or MS [17], ED-XRF analysis is more widely used for bronze studies. This is mainly due to the instrument's easy portability: although it is unable to detect certain elements and lacks accuracy, it still allows class recognition based on the main elements [18].
To what extent did ancient foundries control the concentration of elements, and principally, were they able to/interested in recognising the composition of a broken object before reusing or re-melting it? These questions are the archaeological premise for the present study, centred not on the measurement methodology but aiming to investigate data correlation by particular chemometric treatment in order to confirm or reject the hypotheses that ancient metalworkers reused/recycled entire objects or parts of them.

Results and discussion
Production zone, foundries
In the central part of Italy, and in particular in the Abruzzo region, no large quarries for the extraction of minerals for bronze production are present [19]; therefore bronze was obtained either by metallurgical processes with expensive imported minerals or, probably, by recycling bronze objects. The compositional heterogeneity of Abruzzan bronze antiquities, also highlighted in this work, suggests that production was very often based on recycling processes.
The oldest items in this study date back to the Eneolithic Period; among them is a small axe of almost pure copper, probably one of the few objects made with rare native copper.
The majority of the antique bronze items considered in this study are objects of the "ripostiglio di Alanno".
Archaeometric investigation brought to light that in this period objects were already made with a tin-rich bronze alloy, with tin content even higher than 10%.
The more recent pieces, of either the Bronze or the Iron Age, also show a relatively high tin content; this is probably related to the improvement of the mechanical resistance of such an alloy. Furthermore, tin can accumulate in the corrosion patina, as observed on an object found near Sulmona, where the tin content reaches up to 15%.

Sampling method
Any Exploratory Data Analysis (EDA) starts with sampling; this chemometric analysis starts from a collection of objects stored in various museums in central Italy (measurements were conducted under the auspices of a CNR project started in 1997 with the aim of improving scientific knowledge of the immense inheritance of objects). After studying the variety (type and age) of the conserved objects, we selected a probability-proportionate-to-size sampling method. A data set should contain a certain level of redundancy so that the precision of the method (and/or the homogeneity of the alloy) can be assessed; in this case a third of the objects were measured twice or more.

Elemental analysis
For each object at least one measurement point was identified; the selection criterion for the measurement point was the possibility of removing the patina without damaging the object. An example of patina removal to access the "original material" is shown in Figure 2 and in Fig. A (please note that only figures named with numbers are shown in the text, while those named with capital letters are located in the Additional file 1). The patina was abraded with a tiny diamond sphere to avoid contamination by any other metal and three ED-XRF measurements were performed (see Fig. B in the Additional file 1); their mean was then considered as the "bulk" composition. When possible, an object was analysed at 2, or up to 5, points; in this case the collected data should allow estimation of the repeatability of the object/instrument system, taking into account the heterogeneity of the antique alloy. Two different types of repetition were performed: 3 times on the same point without moving anything, to evaluate the "precision" of the method, and then on different points of the object, to evaluate the "homogeneity" of the alloy. The next step was the computation of the average value of the detected elements, shown in Table 2. Fortunately, the average value of the 134 objects is in accordance with the typical composition of bronze objects for the middle and late Bronze Age in Italy and France [9,20].
In the field of multivariate analysis and chemometrics the word "classification" has a well-known meaning. Therefore, in the present paper we tried to avoid this term and to substitute it with the term "group", because we categorised objects that had formerly been labelled with standard criteria and then differentiated them using an Attribute-Value system. After some discussion, we decided to use 3 criteria: age, shape and daily usage. Following suggestions from an expert restorer, another criterion was added and thus another subdivision obtained. Further discussion produced further grouping methods; in total, 5 different Attribute-Value systems were produced, shown in Table 3 (with group dimension and selection criteria).
Measurement repeatability should be evaluated; however, when different compositional values were registered on different spots of the same object, it was not possible to decide whether they were due to the very plausible heterogeneity of the alloy or to a repeatability problem. This is a common problem when working on archaeological material. However, the initial calibration measurements, also undertaken for quantification purposes on material with certified and similar composition, should guarantee accuracy. 44 objects were measured twice or more; a moon-shaped razor was measured seven times and the values obtained are given as an example in Table 4, which shows a possible method for differentiating the reproducibility of the instrument/object system from the alloy heterogeneity. The measurements 838-1, 838-2 and 838-3 were performed on the same point, the measurements 838-21 and 838-22 on a different point, and 838-31 and 838-32 on yet another point. The razor was a special case: it was measured at 3 different points, giving 3 + 2 + 2 measurements.
The arithmetic mean, the median, the standard deviation and the median absolute deviation from the median were calculated as well. Table 5 shows the spread among measures obtained on the 4 objects with measurement repetition (3 times on the same point, see Table 6) and on the 7 objects measured on more than 3 points (see Table 5), using the Median Absolute Deviation as indicator.
Obtained statistical values are in accordance with typical data produced by applying an ED-XRF method.
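The descriptors named above can be illustrated with a minimal sketch. The Sn readings below are hypothetical and only mimic the 3 + 2 + 2 layout of the razor measurements; they are not values from Table 4. The helper name `mad` and the idea of comparing the spread within a point with the spread between point medians are assumptions made for illustration, not the authors' exact procedure.

```python
from statistics import mean, median, stdev

def mad(values):
    """Median absolute deviation from the median (unscaled)."""
    m = median(values)
    return median(abs(v - m) for v in values)

# Hypothetical Sn readings (%): three repeats on point 1, two on points 2 and 3.
sn_by_point = {
    "point 1": [9.8, 10.1, 10.0],
    "point 2": [11.2, 11.4],
    "point 3": [8.9, 9.1],
}

# Spread within a point: repeatability of the instrument/object system.
within_point_mad = {p: mad(v) for p, v in sn_by_point.items()}

# Spread between the point medians: heterogeneity of the alloy.
point_medians = [median(v) for v in sn_by_point.values()]
between_point_mad = mad(point_medians)

# Overall descriptors on the seven pooled readings.
all_readings = [x for v in sn_by_point.values() for x in v]
summary = {
    "mean": mean(all_readings),
    "median": median(all_readings),
    "sd": stdev(all_readings),
    "mad": mad(all_readings),
}
```

With data of this kind, a between-point spread that is clearly larger than the within-point spread would point to alloy heterogeneity rather than to a repeatability problem.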
Before the measurements obtained on different points of the same object are combined using a central value descriptor, one must check for outliers. In Table 7 the 200 measurements were evaluated to find possible outliers; values beyond the 95th percentile (median centred) are shown in bold.
The first decision criterion in the outlier analysis was to maintain all objects with only one outlier element, reducing the table by half. Immediately, the 5 measurements on the 5 Agemine containing both Fe and Cu stood out. All grouping attempts place these 5 objects in a separate group and thus they may be excluded from further computation. Further analysis of the table highlighted three objects of the very early Bronze Age, made from almost pure copper; being the only objects of this period, they could not be excluded. Then there is object n°35780, a casting residue made from almost pure metal; this object is not an outlier. Lastly, object n°14228 could be an outlier due to high Pb, but careful reading of the table and raw data showed that all spearheads have an important Pb and Sn % (even if they come from the same settlement) and thus it will be kept as well.
As a result, outlier analysis within EDA has proved helpful, highlighting groups with extreme composition.
Unfortunately, the Dixon and/or Grubbs outlier test is not applicable because our distribution is not normal and the number of objects is too large (maximum number is 30 to 100 [21]).
At this point it should be made clear that, instead of using other non-parametric methods like Peirce [22,23] or Chauvenet [24], a debatable method was used: objects were checked beyond the 95th percentile centred on the median; the same idea is supported by the MAD in Table 5.
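One possible reading of the "95th percentile centred on the median" check is sketched below. The text does not give the exact rule, so the interpretation used here (flag a value when its absolute deviation from the median exceeds the 95th percentile of all such deviations) is an assumption, as are the function names and the hypothetical Pb values.

```python
from statistics import median

def percentile(sorted_values, q):
    """Percentile (q in 0..100) of a sorted list, by linear interpolation."""
    if not sorted_values:
        raise ValueError("empty data")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def flag_beyond_percentile(values, q=95.0):
    """Indices whose deviation from the median exceeds the q-th percentile
    of all absolute deviations (assumed interpretation of the rule)."""
    m = median(values)
    deviations = [abs(v - m) for v in values]
    cutoff = percentile(sorted(deviations), q)
    return [i for i, d in enumerate(deviations) if d > cutoff]

# Hypothetical Pb readings (%) with a single extreme value at the end.
pb = [1.0, 1.2, 0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 1.1, 0.9,
      1.2, 0.8, 1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.1, 9.5]
flagged = flag_beyond_percentile(pb)
```

A rule of this kind always flags roughly the most extreme 5% of values, whether or not they are true outliers, which is one reason why the flagged objects still have to be judged case by case.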
In the electronic Additional file 1, frequency distribution charts of single elements were included (Fig. C). After removing the outliers, the resultant matrix had dimensions of 129 rows by 7 columns (objects/metals): this is because 5 Fe-rich objects (falsely assigned to be bronze) were omitted. Furthermore, the metal As, detected in only 2 out of 134 objects from the matrix, was removed because it was identified as an outlier during the PCA.

Data analysis
Matrix scaling often has a distorting or simplifying effect on successive multivariate data treatment [25].
To highlight the influence of scaling on the data, box-and-whisker plots are included in the Additional file 1 (Fig. N, log-ratio scaling). Based on experience in treating datasets of around 100 objects measured with the ED-XRF technique, and following recommendations [26][27][28], log-scaling was selected for column pre-treatment.
Compositional data deriving from WD or ED-XRF measurements often contain a high number of "zero" values for the minor alloy elements; classification based on this data is a problematic issue. "In compositional data analysis we distinguish two kinds of zeros: essential zeros - or absolute absence of the part in the observation - and rounded zeros - or presence of a component, but below detection limit" [29]. Distinction of the two types of zeros is not possible if values are closed to 100% on the object row, as in the present case.
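A minimal sketch of a log-ratio pre-treatment of closed data with zeros is given below. It assumes that the "log-ratio centred" scaling corresponds to a centred log-ratio (clr) transform of each row followed by column centring; the text does not spell this out. The four-part compositions, the zero-replacement value `small` and all function names are hypothetical.

```python
import math

# Hypothetical compositions (wt %) for four parts: Cu, Sn, Pb, Fe.
rows = [
    [88.0, 10.0, 2.0, 0.0],   # Fe not detected: recorded as zero
    [85.0, 12.0, 2.5, 0.5],
    [92.0, 7.0, 0.8, 0.2],
]

def replace_zeros(row, small=0.005):
    """Replace zeros by a small positive value (assumed), then re-close to 100 %."""
    filled = [v if v > 0 else small for v in row]
    total = sum(filled)
    return [100.0 * v / total for v in filled]

def clr(row):
    """Centred log-ratio: log of each part minus the mean log of the row."""
    logs = [math.log(v) for v in row]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

def centre_columns(matrix):
    """Subtract the column mean from every entry of each column."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]

clr_rows = [clr(replace_zeros(r)) for r in rows]
pretreated = centre_columns(clr_rows)
```

The logarithm is undefined at zero, so some replacement is unavoidable; treating every zero as a rounded zero, as done here, is a simplification, since the two kinds of zeros cannot be told apart in closed data.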
A first attempt to separate groups was made using a scatter plot matrix, without matrix pre-treatment, showing all possible variable-variable combinations. The scatter plot matrix (SPLOM) in Figure 3 shows that the objects are indeed bronzes, as only the Cu-Sn scatterplot shows correlation. One object with high Zn is highlighted in red. This EDA graph method was useful for detecting particular situations like that of object n. 24486 (a fibula measured at 4 points, see the Additional file 1): we can see the low value of Cu and the high value of Sn, while the remaining elements show central values with respect to their distributions.
A 3D histogram with median values for each element was used in an attempt to identify whether there was a correlation between composition and grouping (5 groups as decided by the authors). The histogram shows a decrease of Cu with age and a consequent increase of Sn in two steps, together with a continuous variation of Pb, which is certainly intentional.
In the Additional file 1, the median of the concentrations of elements for all 134 objects is presented; various graphs, including the High Low Open Close (HLOC) chart for each metal and group, are shown. Again, none of these graphic representations seems to indicate a solution to our problem. In similar situations, multivariate analysis has often offered a satisfying solution, as in the case of identifying different types of Imperial age marble [30] and the identification of characteristic parameters of "glass paste" [31], where an unsupervised technique like Principal Component Analysis (PCA) enabled the identification of object groups which were not detectable by former mathematical and graphic methods.
The scatterplot matrix in Figure 5 (and Fig. T) is interesting; the Attribute-Value groups, as decided by the authors, are not correlated and thus it is reasonable to repeat projections for all group scores. Chemometric analysis was performed in an attempt to visualise clouds of objects through PCA using all elements and also including the Agemina group. In Figure 6 the 3D representation of all objects and "usage groups" is shown. Weak separation is only evident for one group, on the upper right, composed of Fe-rich objects. This finding is encouraging for the development of the model and successive results.
Another small group on the far left of the chart can be identified, although recalculation without the Fe-objects would probably produce better separation. Based on these representations, we decided to exclude the 5 Fe-rich objects, which compressed the potential separation of other objects, and to exclude As because, as shown in Table 2, it was detected in only 5 out of 200 measurements (e.g. in objects n°36521, a nail, and n°106647, a sword). With these exclusions we obtained the previously cited matrix dimensions (129 × 7) with logratio centring as matrix pre-treatment; the scree plot in Figure 8 suggests visualisation of the first three PCs. In Figure 9 the new loadings are shown, confirming that the 3rd component has an indisputable importance. Figure 9 shows high values of Pb (positive) and Fe (negative) for PC1; only Fe has a positive high value in PC2, while Zn and Sb are in opposition in PC3.
Figure caption: PCA scores chart for all 134 objects after log-ratio and column centring pre-treatment; using the GV grouping method described in Table 3, only 2 groups are enhanced: one with the Fe-rich objects (upside) and one with the Eneolithic objects (on the left).
In Figure 10a-c the scatterplots of the scores are shown for different combinations of the first three PCs, using the Attribute-Value grouping method suggested by author G.G. (based on supposed age of production).
In Figure 11 the three previous graphs are shown from a different point of view (3D) for better visualisation of the objects' projection; colours refer to object grouping as suggested by author G.G. PCA is unsupervised: it is not a classification method, and the computation does not take into account our 5 defined grouping methods, which were thus visualised simply using differently coloured markers. Table 8 shows the eigenvalues, RSS and PRESS values. Table A in the Additional file 1 shows the matrix after pre-treatment and the scores and loadings used for all following charts.
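How eigenvalues, loadings and scores arise from a pre-treated matrix can be sketched with a small, dependency-free PCA. This is not the software used in the study: the implementation below (power iteration with deflation on the covariance matrix), the function names and the 6 × 3 data matrix are all assumptions chosen for illustration. Note that no group label enters the computation, which is what "unsupervised" means here.

```python
import math

def column_centre(matrix):
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]

def covariance(centred):
    n = len(centred)
    cols = list(zip(*centred))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]

def leading_eigenpair(a, iterations=500):
    """Largest eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    p = len(a)
    v = [1.0 / (j + 1) for j in range(p)]  # non-uniform start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-15:
            break  # matrix is numerically zero: nothing left to extract
        v = [x / norm for x in w]
    av = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(x * y for x, y in zip(v, av)), v  # Rayleigh quotient, vector

def pca(matrix, n_components):
    centred = column_centre(matrix)
    a = covariance(centred)
    eigenvalues, loadings = [], []
    for _ in range(n_components):
        lam, v = leading_eigenpair(a)
        eigenvalues.append(lam)
        loadings.append(v)
        # Deflation: remove the component just found before looking for the next.
        a = [[a[i][j] - lam * v[i] * v[j] for j in range(len(v))]
             for i in range(len(v))]
    scores = [[sum(x * l for x, l in zip(row, v)) for v in loadings]
              for row in centred]
    return eigenvalues, loadings, scores

# Hypothetical pre-treated values: 6 objects, 3 variables.
data = [
    [-5.0, -1.0, -0.5],
    [-3.0, -1.8, 0.7],
    [-1.0, -1.7, 0.4],
    [1.0, -0.7, -0.4],
    [3.0, 1.2, -0.7],
    [5.0, 4.0, 0.5],
]
eigenvalues, loadings, scores = pca(data, 3)
explained = [lam / sum(eigenvalues) for lam in eigenvalues]
```

The list `explained` holds the fraction of variance carried by each component, which is the quantity a scree plot displays; grouping labels would only be used afterwards, to colour the markers in a scores plot.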
According to our customary approach to open discussion and sharing of data and results, in the Additional file 1 we included an .xls file with the raw data of all 134 measurements and the median; for the final 129 objects considered, we included principal statistics, correlation and covariance matrices, a series of sheets with different grouping proposals and some further graphs. In addition, log-ratio values for further elaboration can be found.

Experimental
"… no analysis is better than the sample itself…" [32], so extreme care is required for object selection, instrument calibration and measurement point selection. This is even more important when using partially unstable and semi-quantitative instruments or methods.
For ED-XRF analysis the measurement equipment was composed of an air-cooled X-ray tube with tungsten anode (0.6 mm focal spot, internal 3 mm Al filter, HV max. 50 kV and max. current 1 mA, although working conditions were 40 kV, 0.35 mA) and a cryogenic Germanium X-ray detector (EG&G ORTEC) with Beryllium window and a 195 eV energy resolution at the iron line.
Analysts (among the authors) chose to abrade spots of the corrosion patina, up to 4 mm² in size, with an abrasive system before measurements, in order to obtain data more representative of the bulk composition. Although micro-invasive, this technique allowed for measurements on many objects and produced a large number of element-concentration (%) data. As stated in the literature, this method does not provide absolute, quantitative values but only relative percentages, although the use of reference materials with very similar alloy composition can produce semi-quantitative data acceptable for the purposes of this study.
The acquisition procedure was driven by a self-made computer software employing a simple graphic interface which helps in both the choice of acquisition parameters and the processing of spectra. The apparatus, built by two of the authors [33], was used in situ, placing it on a small movable wheelbarrow and the analysed point (position in which the exciting beam impinges on the bronze surface) was identified by a red laser pointer.
To convert files into a common format used as input by the computation software, the Calc package by LibreOffice (LibreOffice 4, free from The Document Foundation, Germany) was employed; statistical computation was carried out with WinIdams (free from Unesco, FR), with MVSP (by Kowak Co., UK) and with Prof. Brereton's add-in for Excel for multivariate analysis (free from Bristol Chemometrics, UK).

Conclusions
This study was undertaken to answer a fundamental question posed by archaeologists: did ancient foundry-men intentionally change the alloy composition of bronze objects in relation to their type of use? As suggested by several authors, re-using objects and metal alloys is not only a modern-day practice. Experimental evidence shows that during the late Bronze Age, recasting of broken or damaged items was already common. Expert foundry-men had also discovered that heat can be saved by using broken objects instead of minerals (as suggested today by the "soft landing" idea, by Embodied Energy or Emergy Concept). In this case, multivariate analysis using an explorative and visual method failed to identify homogeneous "clouds of objects" and seems to confirm the previously stated thesis. Physical properties of bronze objects were therefore probably obtained by thermomechanical treatments rather than by intentionally changing the alloys' composition.
In more detail, although all the representations of the PC score space suggest a series of clusters (shown in Table 9), none of the 5 grouping methods (see Fig. Y, AD, AH, AL, AP in the Additional file 1) suggests that there was a clear intention to change the percentages of the alloying elements for the investigated objects. The purpose of Table 9 is also to invite researchers to propose different grouping methods to verify whether there was a problem with variable definition in object grouping. However, our method does not seem to be erroneous, as objects 1, 3 and 125, two axes and a casting residue, characterised by high Cu and low percentages of other elements, stay grouped in all projections.
Figure 11 PCA scores in 3D representation (129 objects, 7 elements). Colour markers reflect object grouping by the age criterion (author G.G.), so the chart is the combined 3D representation of Figure 10a, b and c.
Table 9 Clouds of objects recognised in at least one of the Scores projections but not associated to any of the proposed groups. Only the first one, on the upper left, is composed of pure Cu objects of different age.

EDA-EFA
This data-set is considered a good example for the use of Exploratory Data Analysis or Exploratory Factor Analysis. After some computation no grouping is evident.
Conversely, if a true classification method were used (a supervised method like Confirmatory Factor Analysis or LDA), data could be "adjusted" in order to obtain a desired classification, but with the associated risk of overfitting.

Distributions
When cultural heritage objects are studied, a Gaussian or Gosset distribution cannot be taken for granted; this is shown well by the frequency distribution charts of elements contained in the Additional file 1. A non-parametric approach, based on median, percentiles, etc., is therefore advisable; for this reason the median absolute deviation from the median (MAD) was used to study variations among measurements on the same object, while percentiles were used for outlier checking.
MAD is robust in the presence of outliers, in contrast to the standard deviation, which can be influenced by a single extreme value. Similarly, the interquartile (or inter-percentile) range is robust against outliers and can be used to detect an anomalous value.
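The contrast between the two spread measures can be made concrete with a short sketch. The readings are hypothetical, and the 1.5 × IQR fences are the conventional Tukey rule, used here as an assumption since the text does not state a specific multiplier.

```python
from statistics import median, quantiles, stdev

def mad(values):
    """Median absolute deviation from the median (unscaled)."""
    m = median(values)
    return median(abs(v - m) for v in values)

# Hypothetical Sn readings (%); the second list swaps one value for an extreme one.
clean = [9.6, 9.8, 9.9, 10.0, 10.1, 10.2, 10.4]
contaminated = clean[:-1] + [25.0]

# How much each spread measure changes when one extreme value is introduced.
sd_ratio = stdev(contaminated) / stdev(clean)
mad_ratio = mad(contaminated) / mad(clean)

# Interquartile range and conventional fences (1.5 x IQR is an assumed choice).
q1, _, q3 = quantiles(contaminated, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
anomalous = [v for v in contaminated if v < lower or v > upper]
```

In this example the standard deviation grows by more than an order of magnitude while the MAD does not change at all, and the fences built on the interquartile range isolate only the extreme reading.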
The importance of using non-parametric methods for all our calculations can be demonstrated by Figure 12, showing the frequency distribution of Sn and Pb for the 134 objects: these are clearly not Gaussian distributions.
Figure 12 Frequency distribution chart of the elements Sn and Pb in the 134 measured bronze objects. As their distribution is not a normal one, it is difficult to identify outliers. However, many studies on the composition of ancient objects produce similar charts.

Scaling
As already stated, scaling is a necessary but treacherous pre-treatment of raw values. In any spectroscopic method, pre-treatment, like a simple derivation, can strongly improve information extraction from data. There are hundreds of references in this field, for example that of Jolliffe, one of the fathers of PCA, studying the distortion effect [25]. As mentioned above, we decided to use log-ratio scaling. For this purpose, the column of As was cut away from the raw data matrix and the remaining one was inserted