Checklist of the vascular plants of the Cantabrian Mountains

We present the first standardized list of the vascular flora of the Cantabrian Mountains, a transitional zone between the Eurosiberian and Mediterranean biogeographic regions in northwestern Spain. The study area comprises 15,000 km² divided into UTM grid cells of 10 km x 10 km, for which we revised occurrence data reported in the Spanish Plant Information System (Anthos) and the online database of Iberian and Macaronesian Vegetation (SIVIM). We used a semi-automatic procedure to standardize taxonomic concepts into a single list of names, which was further updated by expert-based revision with the support of national and regional literature. In the current version, the checklist of the Cantabrian Mountains contains 2,338 native species and subspecies, of which 56 are endemic to the study area. The nomenclature of the checklist follows Euro+Med in 97% of taxa, including annotations when other criteria have been used and for taxa with uncertain status. We also provide a list of 492 non-native taxa that were erroneously reported in the study area, a list of local apomictic taxa, a phylogenetic tree linked to The Plant List, a standardized calculation of Ellenberg Ecological Indicator Values for 80% of the flora, and information about life forms, IUCN threat categories and legal protection status. Our review demonstrates that the Cantabrian Mountains represent a key floristic region in southern Europe and a relevant phytogeographical hub in south-western Europe. The checklist and all related information are freely accessible in a digital repository for further uses in basic and applied research.


Introduction
The Cantabrian Mountains are the westernmost mountain range of Europe, with elevations over 2,500 m above sea level and a central axis running c. 170 km in parallel with the northern Atlantic coast of Spain (Figure 1). These mountains originated in the late Mesozoic during the Pyrenean-Cantabrian Orogen of the Alpine orogenic cycle (López-Gómez et al., 2019). Nowadays, the Cantabrian Mountains are a transitional biogeographic zone between the Mediterranean and the Eurosiberian regions (García-Gutiérrez et al., 2018), and a crossroads for plant diversity in the Iberian Peninsula (Kropf et al., 2002; Buira et al., 2017). Past climatic isolation and a very rugged terrain make these mountains a unique refugium for temperate trees (Roces-Díaz et al., 2018) and a center of endemic plants in the Iberian Peninsula. The Cantabrian Mountains are also among the highest-rated Important Plant Areas defined in Spain (Sánchez de Dios et al., 2017).
The biodiversity value of the Cantabrian Mountains has been widely recognized by the declaration of 13 UNESCO Biosphere reserves, more than 10 natural areas protected by regional governments, and the oldest National Park of Spain (Picos de Europa National Park, established in 1918 as Montaña de Covadonga). All these natural areas are currently integrated into the Natura 2000 European conservation network, forming a continuous array of mountain areas from which different river basins originate towards the north (Cantabrian Sea), south (Duero river) and west (Atlantic Ocean). The relatively well-preserved habitats of these mountains support iconic species of high conservation concern, including endemic populations of brown bear (Ursus arctos), chamois (Rupicapra pyrenaica parva), broom hare (Lepus castroviejoi), Iberian desman (Galemys pyrenaicus) and capercaillie (Tetrao urogallus cantabricus). However, the biodiversity of these mountains is also subject to anthropogenic threats related to forest fragmentation (García et al., 2005) and changes in the intensity and density of cattle (Blanco-Fontao et al., 2011). Despite the importance of the Cantabrian Mountains for European biodiversity, we lack reference floras or annotated lists for this region to be used in basic and applied research. The few available floras from nearby regions overlap only partially with the mountain range, since they are based on administrative units like Galicia (Romero-Buján, 2008), Asturias (Fernández-Prieto et al., 2014), Cantabria (Durán-Gómez, 2014) or Burgos (Alejandre-Saénz et al., 2006). At the national level, Flora iberica (Castroviejo et al., 1986-2021) reports the distribution of Iberian plants only at the administrative level, making it difficult to extrapolate floristic lists with a biogeographical basis.
Another limitation for reporting the floristic diversity of the Cantabrian Mountains is that regional and national floras are heterogeneous in their taxonomic concepts. The lack of an updated floristic list for the Cantabrian Mountains thus prevents comparative studies with other mountain regions and the use of botanical data in conservation strategies.
In this study, we provide the first checklist of vascular flora for the Cantabrian Mountains, using a standardized taxonomy and an up-to-date nomenclature. By collecting information stored in national botanical databases, we develop a semi-automatic process followed by expert assessment to standardize names of native taxa at the species and subspecies level. To facilitate the use of the flora in comparative analyses, we also collect information about life forms, endemicity, IUCN threat and legal protection status of the referred taxa. To describe the evolutionary and ecological context of the regional flora, we provide a phylogenetic tree and a set of Ecological Indicator Values calculated under a standardized method. The digital version of these data is presented in an open repository to facilitate further studies of the Cantabrian flora at regional or supra-regional scales.

Study area
We established a grid of 155 UTM cells of 10 km x 10 km as operational geographic units to collect occurrence data for vascular plants (Figure 1). The grid was defined to include the whole Orocantabrian subprovince as defined by Rivas-Martínez et al. (2017), since this is the phytogeographic unit that best fits the geographic conception of the Cantabrian Mountains. However, we note that the necessary use of the grid cells extends our study area to marginal zones of the Cantabro-Atlantic subprovince (in the north) and the Mediterranean region (in the south) as defined by Iberian phytogeography (Díaz González & Penas, 2017; Fernández Prieto et al., 2020). As defined here, the Cantabrian Mountains occupy 15,000 km² across four Spanish autonomous regions (Galicia, Asturias, Castilla y León, and Cantabria), containing all elevations of the Cantabrian range above 1,200 m asl, together with their proximal valleys (Figure 1). Elevation varies from c. 10 m in the deepest calcareous valleys to 2,648 m at Torre Cerredo in the Picos de Europa massif. The longitudinal direction of the mountain central axis creates a geographic barrier between the Cantabrian Sea and the northern Iberian plateau, producing a strong climatic gradient from northern exposures influenced by wet oceanic winds to southern slopes subjected to a continental climate (Díaz González & Penas, 2017). The study area occupies c. 30% of the Cantabrian Mixed Forests ecoregion (Olson et al., 2001) along the southern border of the Atlantic biogeographic region in Europe (Cervellini et al., 2020).

Data compilation
We used the Information System of the plants of Spain (Anthos, 2020) and the Information System for the Macaronesian and Iberian Vegetation (SIVIM, Font et al., 2012) as primary data sources. Both databases provide a comprehensive review of botanical surveys documented for Spain and Portugal, and they are expected to be more complete for the study area than international databases with broader scope (e.g., GBIF, www.gbif.es/). On the one hand, Anthos contains bibliographic observations of the Iberian flora (from publications, herbarium specimens, etc.), which are in most cases georeferenced to grid cells of 10 km x 10 km, including updated observations or descriptions of Flora iberica (Castroviejo et al., 1986-2021). In a study comparing the representativeness of Anthos for the study area, Jiménez-Alfaro (2009) showed that this database has similar or better coverage of floristic information than regional databases and herbaria. On the other hand, SIVIM provides species data from phytosociological studies, which are mostly georeferenced to grid cells of 10 km x 10 km (Font et al., 2012). For the Cantabrian Mountains, SIVIM includes digitized information from vegetation surveys conducted in the last 60 years. Since the taxonomic concepts of SIVIM are not fully standardized, the names and taxonomic criteria may follow different interpretations by the original authors.

Taxonomical backbone
Both Anthos (www.anthos.es/) and SIVIM (www.sivim.info/sivi/) were accessed in April 2020 to collect the plant names reported in the study area. In total, we obtained 5,680 entries and 3,365 unique names that were edited in a spreadsheet and in R, v. 3.6.3 (R Core Team, 2021). Varieties and other infra-rank names were assigned to species and subspecies level when possible, deleting particles such as "cf." or "agg". Hybrids and taxa described at the genus or family level in SIVIM were removed. We used a semi-automatic procedure (Wagner, 2016) to find a first taxonomical solution with a combination of automatic tools and manual editing. The automatic step consisted of extracting the rank and infra-rank names and looking up taxonomical solutions according to The Plant List (2013, v1.1) with the Taxonstand R package (Cayuela et al., 2012). The names were then manually checked against the Euro+Med Plantbase (Euro+Med, 2006) to find accepted names, with the support of the provisional names assigned by The Plant List (e.g., when checking synonyms or translations from original names with typos). The accepted names and the authorities provided by Euro+Med were then used to create a taxonomical backbone with (a) the original names provided in Anthos, SIVIM, or both; (b) the provisional taxonomical solution of The Plant List; and (c) the best taxonomical solution from Euro+Med. The names were preliminarily classified into two groups: (1) taxa native to the study area, including certain or dubious archaeophytes (i.e., plants likely introduced before 1500 AD), and (2) non-native taxa, including cultivated plants, naturalized aliens (neophytes), and taxa supposedly reported due to a misidentification or a georeferencing error. This first diagnosis was performed with the support of national or regional floras, providing a preliminary list of 2,799 unique names.
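The name-cleaning part of the automatic step can be illustrated with a minimal sketch. It is written in Python for illustration only (the original workflow used a spreadsheet and R), and the function name, the marker lists and the example names are assumptions rather than the rules actually applied by the authors. It strips qualifiers such as "cf." or "agg", reduces varieties to the species or subspecies level, and returns nothing for hybrids and for names given only at the genus level.

```python
from typing import Optional

# Qualifiers named in the text; any further markers would be an assumption.
QUALIFIERS = {"cf.", "cf", "agg.", "agg"}
SUBSP_MARKERS = {"subsp.", "ssp."}
# Hypothetical list of ranks below subspecies that are collapsed.
LOWER_RANK_MARKERS = {"var.", "subvar.", "forma"}
HYBRID_SIGNS = {"x", "×"}


def normalize_name(raw: str) -> Optional[str]:
    """Reduce a raw database name to 'Genus epithet [subsp. epithet]'.

    Returns None for hybrids and for names without a species epithet,
    which are the entries removed before building the backbone.
    """
    tokens = [t for t in raw.split() if t.lower() not in QUALIFIERS]
    # Hybrids are flagged by a standalone sign or a sign fused to the epithet.
    if any(t.lower() in HYBRID_SIGNS or t.startswith("×") for t in tokens):
        return None
    # Genus-level (or higher) records carry no usable epithet.
    if len(tokens) < 2 or tokens[1].lower() in {"sp.", "spp."}:
        return None
    name = f"{tokens[0].capitalize()} {tokens[1].lower()}"
    for i, tok in enumerate(tokens[2:], start=2):
        low = tok.lower()
        if low in LOWER_RANK_MARKERS:
            break  # varieties and lower ranks fall back to the species
        if low in SUBSP_MARKERS and i + 1 < len(tokens):
            name += f" subsp. {tokens[i + 1].lower()}"
            break
    return name


print(normalize_name("Festuca cf. indigesta"))            # Festuca indigesta
print(normalize_name("Betula pubescens var. pubescens"))  # Betula pubescens
print(normalize_name("Salix x multinervis"))              # None
```

The cleaned names would then be matched against The Plant List and checked manually against Euro+Med, as described above; that matching step is not reproduced here.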

Expert-based revision
The aims of the expert revision were: (i) to include new taxa not reported in the original sources but with solid evidence of their occurrence in the study area, including recently described taxa; (ii) to revise whether the taxa were correctly classified as native or non-native; and (iii) to assess the final treatment for subspecies (as a rule, the nominal subspecies was specified only when there was evidence of the occurrence of another subspecies of the same species in the study area). The preliminary list was scrutinized by the authors based on their expertise in specific taxonomic groups or subregions within the study area. We also used floras recently published for overlapping territories (e.g., Romero-Buján, 2008; Alonso-Felpete et al., 2011; Fernández-Prieto et al., 2014; Durán-Gómez, 2014) and the websites of Anthos and SIVIM to double check the original records.
The revision was guided by the taxonomic and nomenclatural concepts of Euro+Med, but we accepted exceptions when a different treatment had a consensus among the authors. These exceptions included a few genus names that we considered not supported by phylogenetic data (for example, we kept Festuca instead of Patzkea and Bromus instead of Anisantha). In some cases, we accepted names that in Euro+Med were considered synonyms of other taxa (e.g., we accept Betula celtiberica while Euro+Med assigns it to B. pubescens var. pubescens). In general, we accepted all species and subspecies reported in the study area (e.g., in Alchemilla), but in a few cases we avoided multiple subspecies accepted in Euro+Med (e.g., in Sideritis hyssopifolia) because many of them were difficult to verify as native or non-native for the study area. Other species and subspecies not mentioned in Euro+Med referred to recently described taxa (e.g., Rivasmartinezia vazquezii). For the apomictic genera Pilosella and Hieracium, we used a recent review for the Iberian Peninsula (Mateo Sanz & del Egido, 2018) to identify "principal" (= basic) taxa (i.e., well-established species or subspecies not subjected to introgression) and "intermediate" taxa which behave as principal (i.e., those with spatial and ecological differentiation from the parentals and with autonomous reproduction, even if they had a hybrid origin). The remaining intermediate taxa (151 species and subspecies), representing local hybrids associated with the populations of the parent taxa, were discarded from the checklist but added to the Supplementary information in the digital version of the checklist.
We discarded taxa reported by mistake in the study area. Many of these records were assigned to non-native species during the botanical explorations of the early 20th century, identifying local taxa with names from other parts of Europe or the Iberian Peninsula (e.g., many citations by Michel Gandoger reported in Anthos). Another relevant source of error was the presumably automatic assignment of geographic coordinates from the study area to localities with the same name located elsewhere in Spain. Mere mistyping when entering the numbers in the original sources or in the compilations, though occasionally found, is a far less serious issue. Species with only one record that were never confirmed or supported by a specimen (e.g., Carex hostiana) were considered non-native in the study area.

Species Information
We created a species-level phylogenetic tree of the native taxa using the 'V.PhyloMaker' R package (Jin & Qian, 2019) with a mega-tree based on a combination of the trees in Smith & Brown (2018) for the seed plants and Zanne et al. (2014) for the lycophytes and ferns. Species that were absent from the mega-tree were bound to the genus-level basal node. Family names were based on The Catalogue of Life Partnership (2017) and represented in a word cloud using the 'wordcloud' R package. For all species and subspecies, we further collected information about life form, endemicity, IUCN threat status, and legal protection. The most frequent life form for each taxon was collected from regional and national floras based on Raunkiaer (1934). Endemic status was updated from a previous checklist of endemic and subendemic plants of the Cantabrian Mountains (Jiménez-Alfaro et al., 2008), using three categories: (i) endemic to the Cantabrian Mountains (CM) as defined in this study, in most cases restricted to the Orocantabrian subprovince sensu Rivas-Martínez et al. (2017); (ii) endemic to the Cantabrian Mountains and the Mountains of León (CM+ML), for taxa distributed in the study area and the Montes de León, a nearby mountain region with strong floristic connections (Díaz González & Penas, 2017; Fernández Prieto et al., 2020); and (iii) endemic to the Cantabrian Mountains and the Pyrenees (CM+PY), based on the aforementioned references and the online Atlas of the Vascular Flora of the Pyrenees (http://florapirineos.ipe.csic.es). In all cases, a taxon was considered endemic when 99% of its populations were estimated to occur in the aforementioned geographic areas.

Ecological Indicator Values
We used Ecological Indicator Values (EIVs) to describe the ecological requirements (here described as the preferential niche of species in natural or semi-natural vegetation) of the native taxa included in the checklist. These indices were originally developed for Central Europe by Ellenberg et al. (1991), who established nine ordinal scores (1-9) for temperature (T), continentality (K), light (L), soil reaction (R) and nutrients (N), and 12 scores (1-12) for moisture (F). We calculated EIVs at the species level; that is, different subspecies of the same species share the same EIVs. The only available reference for the study flora (Mayor López, 1996) was developed for Asturias, covering our study area only partially. To provide an ecological assessment with a biogeographic context covering the Atlantic European biogeographic province, we combined the EIVs from (1) Mayor López (1996) for Asturias, based upon Ellenberg's criteria but with a 1-5 scale; (2) Julve (1998) for France, who followed the original Ellenberg scale; and (3) Roy et al. (2000) for the British Isles, who followed Ellenberg's scale but did not calculate T or K. To standardize these scales, we rescaled the values to 0-1 by dividing the original values by the maximum possible value of each EIV. Then we averaged the values for each taxon and EIV to obtain an initial standardized value. The resulting EIVs are thus expected to be comparable with similar indices used for floras of temperate regions in SW Europe.
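The rescaling and averaging described above can be sketched as follows. This is an illustrative Python example, not the authors' code: the source labels, the function names and the example scores are hypothetical, and we assume that the 1-5 scale of Mayor López (1996) applies to every indicator, moisture included.

```python
from statistics import mean
from typing import Dict, Optional


def max_score(source: str, indicator: str) -> int:
    """Maximum possible score of an indicator in a given source flora."""
    if source == "asturias":  # Mayor López (1996): 1-5 scale (assumed for all EIVs)
        return 5
    # France and the British Isles follow Ellenberg: 1-12 for moisture (F), 1-9 otherwise.
    return 12 if indicator == "F" else 9


def initial_eiv(indicator: str, scores: Dict[str, Optional[float]]) -> Optional[float]:
    """Rescale each available score to 0-1 and average across sources."""
    rescaled = [
        value / max_score(source, indicator)
        for source, value in scores.items()
        if value is not None  # a source may lack the taxon or the indicator
    ]
    return mean(rescaled) if rescaled else None


# Hypothetical moisture scores of one taxon in the three sources.
f_value = initial_eiv("F", {"asturias": 5, "france": 12, "british_isles": 6})
print(round(f_value, 3))  # 0.833

# Temperature is not available for the British Isles, so only two sources count.
t_value = initial_eiv("T", {"asturias": 3, "france": 6, "british_isles": None})
print(round(t_value, 3))  # 0.633
```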
We used the reciprocal averaging method (Chytrý et al., 2018) to fill values of the EIVs for the species of this checklist. To do so, we obtained vegetation plot data for Atlantic Spain (i.e. the Cantabro-Atlantic and Orocantabrian biogeographic subprovinces in Spain) from SIVIM (Font et al., 2012). The total number of available plots (relevés) was 12,457. We averaged the EIVs of the taxa present in each plot to obtain the plot-level EIV. Then, for each taxon, we averaged plot-level EIVs of all the plots where the taxon was present, obtaining the final EIVs of the taxa. Finally, we rescaled these EIVs to the original 1-9(12) scale of Ellenberg. We only kept the EIVs calculated with the reciprocal averaging method because they provide standardized values from species co-occurrences within the biogeographical context of the study area. Taxa with an initial EIV in the floras of reference but absent from the vegetation plots were left without EIVs. We finally calculated Spearman correlations between the final EIVs and those reported in the original floras to test whether they retained a similar ecological meaning.
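A minimal Python sketch of this two-step averaging is given below, with plots represented as sets of taxon names. The function name and the toy data are hypothetical. The text does not specify how the final values were returned to the 1-9(12) scale, so multiplying by the maximum score (the inverse of the initial rescaling) is an assumption made here.

```python
from statistics import mean
from typing import Dict, Set


def reciprocal_averaging(
    plots: Dict[str, Set[str]],
    initial: Dict[str, float],
    scale_max: int = 9,
) -> Dict[str, float]:
    """Derive taxon EIVs from plot-level means of initial (0-1) EIVs."""
    # Step 1: plot-level EIV = mean initial EIV of the taxa present in the plot.
    plot_values = {}
    for plot_id, taxa in plots.items():
        known = [initial[t] for t in taxa if t in initial]
        if known:
            plot_values[plot_id] = mean(known)

    # Step 2: taxon EIV = mean plot-level EIV over the plots where it occurs.
    final = {}
    all_taxa = set().union(*plots.values()) if plots else set()
    for taxon in all_taxa:
        values = [
            plot_values[p]
            for p, taxa in plots.items()
            if taxon in taxa and p in plot_values
        ]
        if values:
            # Assumed back-transformation to the original ordinal scale.
            final[taxon] = mean(values) * scale_max
    return final


# Toy data: taxon C has no initial value but receives one from co-occurrences;
# taxon D has an initial value but occurs in no plot, so it gets no final EIV.
plots = {"p1": {"A", "B"}, "p2": {"B", "C"}}
initial = {"A": 0.2, "B": 0.6, "D": 0.9}
result = reciprocal_averaging(plots, initial)
print({taxon: round(value, 2) for taxon, value in sorted(result.items())})
# {'A': 3.6, 'B': 4.5, 'C': 5.4}
```

The toy example shows both consequences noted in the text: taxa lacking a value in the reference floras gain one through co-occurrence, while taxa absent from the vegetation plots are left without EIVs.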

Content of the checklist
The current version (V.1) of the checklist of the Cantabrian Mountains contains 2,338 native taxa, of which 1,996 were reported at the species level and 343 at the subspecies level. The nomenclature of 97% of taxa followed the names accepted in Euro+Med (indicated as "+" in the column E+M, Appendix 1), while the remaining 3% (indicated as "-") were synonyms or names not considered in Euro+Med. We also identified 53 species and subspecies subject to uncertainties about their taxonomical concept or native status in the study area. They were marked with the symbol "?" in the checklist and complemented with a brief synopsis in the supplementary file. This allowed us to apply an inclusive rather than exclusive criterion for taxa with doubtful taxonomical status in the Cantabrian Mountains (e.g., Festuca indigesta), for regional species not accepted in Flora iberica (e.g., Veronica vadiniensis), or for species recently described as native to NW Spain which are supposed to occur in the study area, such as Alnus lusitanica (Vit et al., 2017) or Aira hercynica (Sáez et al., 2020).
In agreement with previous findings (Jiménez-Alfaro, 2008), our results support high sampling completeness of the regional flora, given the relatively low number of species not reported in the data sources. However, the number of taxa erroneously reported for the study area (Table 1) was unexpectedly high, representing 19% of the flora finally considered to be native. This may explain the overestimation of 3,000 taxa predicted in a previous attempt (Jiménez-Alfaro, 2009) based on all records reported in botanical databases but without an expert-based filtering. Our results suggest that, although Anthos and SIVIM are a good source for determining the flora of a given Iberian region, it is necessary to pay attention to errors produced by historical misidentifications or poor georeferencing of localities.

Taxonomic diversity and life forms
The checklist includes 116 families, 680 genera and 2,223 unique species. The whole flora was dominated by angiosperms (96% of the taxa), followed by pteridophytes (3.8%) and gymnosperms (0.2%, with only eight taxa). Among the angiosperms, the Superasterids were the group with the highest number of species, followed by Superrosids and Monocots (Figure 2). From a total of 116 families, those with the highest number of taxa were Asteraceae (284 taxa), Poaceae (202) and Fabaceae (153), following the same pattern reported for the Pyrenees (Gómez et al., 2017) and Sierra Nevada (Lorite et al., 2020). Other frequent families were Rosaceae, Caryophyllaceae, Brassicaceae and Apiaceae (Figure 3).
The most abundant life forms were hemicryptophytes, followed by chamaephytes and therophytes, although in this case relative frequencies were closer to those found in the Pyrenees than to those of Sierra Nevada, where the ratio of geophytes is higher (Table 2). We also found different proportions of life forms among endemic groups within the Cantabrian Mountains (Figure 4). Although therophytes were the second most represented life form in non-endemic taxa, they were much less represented among endemics.

[...] (marked with "?" in the checklist). Despite this, the ratio of regional endemism is similar to that of the Pyrenees but lower than that of Sierra Nevada (Table 2), in agreement with the known patterns of endemicity in the Iberian Peninsula. We also found 79 taxa which are subendemic to the Cantabrian Mountains, sharing geographic ranges with the Mountains of León (19 species and 9 subspecies) and the Pyrenees (35 species and 15 subspecies). These numbers are higher than those reported in Jiménez-Alfaro et al. (2008), likely because we characterized the whole regional flora, reducing the risk of overlooking taxa to be evaluated. The connections between the Cantabrian Mountains and the [...]

[...] one is endemic to the study area and the Pyrenees (Aster pyrenaeus), and one is endemic to Galicia and northern Portugal (Iris boissieri). All endangered taxa are protected in at least one of the legal catalogues analyzed, together with other species or subspecies with a legal status at the regional level (see Supplementary information). In total, 156 taxa were included in at least one of the regional lists of legal protection, of which 21 taxa were also included in the Spanish Catalogue of protected species. At the European level, 27 taxa were included in one of the Annexes of the Habitats Directive (92/43/EEC), three of them considered priority species for conservation (*) in Annex II: Aster pyrenaeus, Centaurium somedanum and Dryopteris corleyi.

Ecological Indicators
We calculated ecological indicator values (EIVs) for 1,890 taxa (80% of the flora). The EIVs were strongly correlated with those assigned to the same taxa for Asturias (Spearman's rho = 0.67, P < 0.001, N = 1,359), France (rho = 0.77, P < 0.001, N = 1,303) and the British Isles (rho = 0.69, P < 0.001, N = 953), supporting an ecological interpretation similar to that of the original sources. The values for temperature (T), soil reaction (R), moisture (F) and light (L) showed distributions slightly skewed towards the highest values (Table 3, Figure 5). In contrast, the EIVs for nutrients (N) were skewed towards the lowest values. These patterns agree with the general trends found in other European regions (Roy et al., 2000; Guarino et al., 2012; Chytrý et al., 2018). In our study area, the EIVs suggest relatively warmer conditions than the average of species evaluated in Central Europe and the British Isles. Although the EIVs for continentality (K) have been recently criticized (Berg et al., 2017), their variability reflects the climatic idiosyncrasy of our study area, with the lowest values likely representing northern areas subjected to strong oceanicity, and a Gaussian distribution for most values.
Although we provide the first assessment of EIVs for the study region using a standardized method for comparative purposes, we note that our method may also provide spurious extrapolations to taxa poorly represented in the original dataset, and the ecological interpretation of individual taxa must be conducted with care.
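The comparison between final and original EIVs relies on Spearman's rank correlation, which can be sketched in plain Python as the Pearson correlation of average ranks. The function names and the example values below are hypothetical; the sketch returns only the coefficient and does not reproduce the significance tests reported above.

```python
from math import sqrt
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho; undefined (division by zero) if x or y is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical final EIVs and original scores for five taxa.
final_eivs = [3.1, 4.0, 5.2, 6.8, 7.5]
original_scores = [4, 3, 6, 5, 8]
print(round(spearman_rho(final_eivs, original_scores), 2))  # 0.8
```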

Conclusions
This study provides the first standardized list of vascular flora for the Cantabrian Mountains. Without considering subspecies, the number of native species represents 40% of the 5,537 species reported for the Iberian Peninsula. As defined here, the Cantabrian Mountains are among the three mountain regions with the highest plant richness in the Iberian Peninsula, following the Pyrenees and Sierra Nevada. Unique characteristics of the Cantabrian Mountains are the proximity to the sea and the strong oceanic influence, explaining the relatively low diversity of gymnosperms found in the study area. Nevertheless, the Cantabrian Mountains are also a crossroads for taxa with different biogeographical optima (Jiménez-Alfaro et al., 2014), supporting relict populations of conifers from continental climates in the southernmost slopes (e.g., Juniperus thurifera) and warm-demanding oceanic species in the northern valleys (e.g., Culcita macrocarpa). Current climatic diversity is further supported by the wide ranges of EIVs detected for continentality (K) and temperature (T), providing multiple climatic niches that might partially explain the high number of Iberian endemics occurring in the region.
Overall, the data presented here contribute to strengthening the floristic knowledge of the Iberian mountains, as a basis for further research in systematics, ecology and conservation biology. We also highlight that, despite the high sampling completeness of the regional flora, a major issue for developing biogeographic reference lists in the Iberian Peninsula is the legacy of misidentifications or georeferencing errors from national databases. Using semi-automatic procedures combined with expert-based revisions therefore seems a suitable approach to solve such issues. Based on this approach, the checklist of the Cantabrian Mountains and the complementary data about non-natives will contribute to further floristic revisions and conservation initiatives in the study area. By using a continental-based authority like Euro+Med (and The Plant List), our checklist can also be used for comparative analyses with other mountain regions at the national and continental level, or for refining the distribution patterns of the Iberian flora in biogeographic units. To support further uses of the checklist, we provide a complete digital version in the open repository https://zenodo.org/record/5153297#.YUtTHWJBxEY to be updated routinely.

García-Gutiérrez, T., Jiménez-Alfaro, B., Fernández-Pascual, E., & Müller, T. 2018