Megalithic monuments are iconic features of the European prehistoric landscape and are associated with Neolithic farming communities (5500-2000 BCE) along the Atlantic Façade, the North Sea, the Baltic Sea and the Mediterranean. France alone is home to over 4,500 dolmen structures, yet their precise function and significance remain subjects of ongoing debate.
In this project, conducted within the framework of the ERC Starting Grant project anthropYXX (StG2023-101117101) of Dr. Andaine Seguin-Orlando, we generate and analyze genomic data from Neolithic individuals across southern France.
In this seminar, I will focus on the site ‘Dolmen de l'Ubac’ (Goult, Vaucluse, France). This late Neolithic burial monument, dating to 3300-2700 BCE, contains primary deposits and disarticulated remains of approximately 40 individuals, 33 of whom were genetically analysed. These remains show no distinct archaeological patterns of gender or social status and are linked to four distinct phases of use of the dolmen as a burial place. The monument stands out as one of the few dolmen sites in southern France with well-preserved human remains, offering valuable insights into late Neolithic burial practices in the region.
By integrating archaeological, anthropobiological, and molecular evidence, this research investigates the burial practices and social structures associated with this site. It seeks to uncover patterns of relatedness at different scales of analysis (parent-child relationships, identity-by-descent sharing patterns linking individuals across centuries, population-level admixture). By documenting who is buried in the monument, it also helps explore how far burial practices may reflect social organization, including potential selection criteria based on age, sex, or social status.
In sum, this work combines laboratory-based DNA extraction, bioinformatic analysis, and archaeological interpretation to shed light on the identities and relationships of Neolithic individuals and to explore their social worlds.
1 Computational analyses
1.1 Preprocessing
Ancient DNA (aDNA) presents unique challenges due to its highly degraded nature. The amount of DNA that can be extracted from ancient samples is often limited and varies widely depending on factors such as temperature, soil conditions, and how the sample has been preserved over time. One of the major challenges in ancient DNA research is modern contamination. Authentic ancient molecules can be distinguished from contaminants computationally thanks to predictable post-mortem damage patterns, including fragmentation and deamination at the ends of ancient DNA strands.
A total of 33 samples were processed, including petrous bones and teeth. This resulted in 18 high-quality shotgun-sequenced genomes, with whole-genome coverage ranging from 1X to 4X. In addition, 4 low-coverage whole-genome capture (WGC) genomes and 11 genome-wide SNP datasets obtained by Twist capture were produced, each with an average coverage of approximately 0.1X.
The sequenced reads were aligned against the human reference genomes hg19 and hs37d5 and subsequently filtered for quality. The workflow included removing PCR duplicates to avoid potential biases, assessing the characteristic ancient DNA damage patterns typically found at the ends of reads, and finally merging all libraries belonging to the same individual.
To assess data authenticity and verify that the sequenced reads originated from ancient individuals, DNA-damage patterns were explored and contamination levels were estimated.
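As a minimal sketch of how such damage patterns can be summarised, the snippet below counts C-to-T mismatches at the first positions of aligned reads. It is an illustration under simplifying assumptions (reads given as reference/read string pairs on the read strand, no indels), not the workflow used in this project; the function name `c_to_t_by_position` and the toy alignments are hypothetical.

```python
def c_to_t_by_position(alignments, max_pos=5):
    """Return the C->T mismatch frequency at each of the first max_pos read positions.

    alignments: iterable of (ref_seq, read_seq) pairs of equal length, both
    written 5'->3' on the read strand, without indels (simplifying assumption).
    Positions where the reference never shows a C yield NaN.
    """
    ref_c = [0] * max_pos    # number of reads with a reference C at position i
    c_to_t = [0] * max_pos   # number of those reads showing a T instead
    for ref, read in alignments:
        for i in range(min(max_pos, len(ref), len(read))):
            if ref[i] == "C":
                ref_c[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1
    return [c_to_t[i] / ref_c[i] if ref_c[i] else float("nan")
            for i in range(max_pos)]


# Toy data (hypothetical): two of three reads starting on a reference C show a T.
toy = [
    ("CAGTC", "TAGTC"),
    ("CCGTA", "TCGTA"),
    ("CAGCT", "CAGCT"),
    ("ACGTC", "ACGTC"),
]
freqs = c_to_t_by_position(toy, max_pos=5)
```

In authentic ancient libraries, the C-to-T frequency is expected to be elevated at the first read positions and to decay towards the read interior, whereas modern contaminant molecules show a flat profile.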
Pseudo-haploid data were obtained by a genotype-calling step in which one of the two alleles is randomly sampled at each genomic position from the aligned diploid sequencing data, allowing for standardized comparisons across low-coverage ancient genomes.
To improve genotype accuracy, imputation with Glimpse2 [2] was performed for shotgun-sequenced samples with coverage above 0.35X, i.e. an average of 0.35 reads covering each position of the genome.
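The random sampling step can be sketched as follows. This is a simplified illustration rather than the project's actual caller: the pileup is assumed to be a dictionary of observed bases per site, and the names `pseudo_haploid_call` and `pileup` are hypothetical.

```python
import random


def pseudo_haploid_call(pileup, rng):
    """Pick one observed base at random for every covered site.

    pileup: dict mapping (chromosome, position) -> list of bases seen in the
    reads overlapping that site. Sites without reads are left out (missing).
    """
    calls = {}
    for site, bases in pileup.items():
        if bases:  # skip sites with no coverage
            calls[site] = rng.choice(bases)
    return calls


rng = random.Random(42)  # fixed seed so the toy example is reproducible
pileup = {
    ("1", 1000): ["A", "A", "G"],  # heterozygous-looking site, three reads
    ("1", 2000): ["C"],            # single read
    ("1", 3000): [],               # no coverage -> missing genotype
}
calls = pseudo_haploid_call(pileup, rng)
```

Sampling a single allele per site avoids calling heterozygotes from one or two reads, at the cost of discarding part of the information available at better-covered sites.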
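To make the coverage threshold concrete, the sketch below computes mean depth from read counts and, under a Poisson approximation, the expected fraction of positions covered by at least one read. The read count, read length and genome size are round, hypothetical numbers chosen only to reproduce a depth of 0.35X.

```python
import math


def mean_depth(n_reads, mean_read_length, genome_size):
    """Average number of reads covering a position (depth of coverage)."""
    return n_reads * mean_read_length / genome_size


def expected_breadth(depth):
    """Expected fraction of positions covered at least once.

    Assumes reads fall uniformly at random (Poisson approximation),
    which real ancient DNA libraries only roughly satisfy.
    """
    return 1.0 - math.exp(-depth)


# Hypothetical library: 17.5 million mapped reads of 60 bp on a 3 Gb genome.
depth = mean_depth(n_reads=17_500_000, mean_read_length=60,
                   genome_size=3_000_000_000)
breadth = expected_breadth(depth)
```

Under these assumptions, a depth of 0.35X leaves roughly 70% of positions without any read, which is why imputation from a reference panel is useful at this coverage.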
1.2 Sex Identification and Uniparental Markers
To gain insights into the identities of the sampled individuals, their genetic sex was determined. Given the use of different sequencing techniques, three complementary methods for sex determination were applied. Additionally, the samples were screened for chromosomal aneuploidies, such as XYY, X0, or trisomy 21.
Mitochondrial and Y-chromosome haplogroups were inferred to explore maternal and paternal ancestry. A high diversity of both mitochondrial and Y-chromosome haplogroups was observed in the Ubac individuals.
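One widely used read-count approach, shown here only as an illustration and not necessarily one of the three methods applied in this project, compares the number of reads mapping to the Y chromosome with those mapping to X and Y combined. The function name `ry_sex` is hypothetical, and the default thresholds are values commonly quoted for this statistic; they should be treated as assumptions.

```python
import math


def ry_sex(n_x, n_y, female_max=0.016, male_min=0.075):
    """Assign genetic sex from the fraction of Y reads among X+Y reads.

    Returns (ry, call). A call is made only when the approximate 95%
    confidence interval lies entirely below female_max or above male_min.
    """
    total = n_x + n_y
    if total == 0:
        return float("nan"), "undetermined"
    ry = n_y / total
    se = math.sqrt(ry * (1.0 - ry) / total)  # binomial standard error
    lower, upper = ry - 1.96 * se, ry + 1.96 * se
    if upper < female_max:
        call = "XX"
    elif lower > male_min:
        call = "XY"
    else:
        call = "undetermined"
    return ry, call


# Toy read counts (hypothetical).
call_a = ry_sex(100_000, 300)[1]
call_b = ry_sex(90_000, 10_000)[1]
call_c = ry_sex(50, 2)[1]
```

With very few reads, as in the third toy case, the confidence interval is too wide for a call, which is one reason to combine several complementary methods for low-coverage samples.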
1.3 Population Genetic Analyses
To explore the genomic makeup of the individuals, a multidimensional scaling (MDS) analysis was first performed, alongside principal component analysis (PCA). As a reference dataset, selected prehistoric individuals from the Allen Ancient DNA Resource (AADR) version 9.1 were incorporated, covering a temporal range from 8000 BCE to 1500 BCE across Europe. As a base, modern European reference populations from the same dataset were included. The distances in a PCA plot reflect genetic similarity between individuals or populations. Individuals that cluster closer together share more genetic ancestry, whereas those further apart are more genetically distinct.
Furthermore, admixture analyses and F-statistics were employed. These can detect gene flow between populations, quantify shared genetic drift, and measure allele frequency differences, offering a quantitative framework to explore population relationships.
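The core of two such statistics can be written in a few lines. The sketch below uses the plain definitions f4(A, B; C, D) = mean of (a - b)(c - d) and f3(C; A, B) = mean of (c - a)(c - b) over SNPs, where lowercase letters are population allele frequencies. It omits the finite-sample bias corrections and block-jackknife standard errors that dedicated software applies, and the frequencies are toy values.

```python
def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): average of (a - b) * (c - d) across SNPs."""
    terms = [(a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)]
    return sum(terms) / len(terms)


def f3(p_c, p_a, p_b):
    """f3(C; A, B): average of (c - a) * (c - b) across SNPs (uncorrected)."""
    terms = [(c - a) * (c - b) for c, a, b in zip(p_c, p_a, p_b)]
    return sum(terms) / len(terms)


# Toy allele frequencies at two SNPs (hypothetical populations A-D).
pop_a = [0.2, 0.6]
pop_b = [0.4, 0.2]
pop_c = [0.1, 0.9]
pop_d = [0.3, 0.5]
f4_value = f4(pop_a, pop_b, pop_c, pop_d)

# A target whose frequencies lie between two sources gives a negative f3.
f3_value = f3([0.5], [0.2], [0.8])
```

An f4 value consistent with zero indicates that the pairs (A, B) and (C, D) share no excess drift, while a significantly negative f3 is evidence that the target population is admixed between populations related to the two sources.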
1.4 Genetic Relatedness
When addressing the question of who was buried in prehistoric monuments, a key aspect is assessing the genetic relationships among individuals buried within the same grave or burial context.
Uniparental markers (mitochondrial and Y-chromosome haplogroups) can provide preliminary clues about shared ancestry; however, sharing the same haplogroup does not necessarily indicate close kinship. To infer biological relationships more accurately, two software tools capable of detecting genetic kinship up to the third degree were employed, enabling us to draft a pedigree for individuals at the Dolmen de l'Ubac site. To further validate these relationships, we used ancIBD [1], a tool that requires imputed data and detects identity-by-descent (IBD) segments, allowing the inference of relatedness up to the 12th degree. Given the increasing uncertainty in kinship interpretation beyond the 5th or 6th degree, the analysis was limited to closer relationships that can reasonably be interpreted as family structure.
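A common principle behind kinship estimation from pseudo-haploid data is the pairwise mismatch rate: related individuals differ at fewer sites than unrelated ones. The sketch below illustrates that idea only and is not the software used here. The function names are hypothetical, and the cut-offs are simply midpoints between the expected normalised values of 0.5 (identical), 0.75 (first degree), 0.875 (second degree) and 1.0 (unrelated).

```python
def mismatch_rate(g1, g2):
    """Proportion of differing alleles over sites called in both individuals.

    g1, g2: equal-length lists of pseudo-haploid alleles, None when missing.
    """
    shared = [(a, b) for a, b in zip(g1, g2)
              if a is not None and b is not None]
    if not shared:
        return float("nan")
    return sum(a != b for a, b in shared) / len(shared)


def classify(p0, baseline):
    """Classify a pair from its mismatch rate relative to an unrelated baseline.

    baseline: typical mismatch rate between unrelated individuals of the
    same population (for example the median over all pairs).
    """
    ratio = p0 / baseline
    if ratio < 0.625:
        return "identical/twin"
    if ratio < 0.8125:
        return "first degree"
    if ratio < 0.9375:
        return "second degree"
    return "unrelated or more distant"


# Toy genotypes (hypothetical): three overlapping sites, one mismatch.
ind1 = ["A", "C", None, "G", "T"]
ind2 = ["A", "T", "G", None, "T"]
p0 = mismatch_rate(ind1, ind2)

label = classify(0.18, baseline=0.24)
```

Because the expected values for successive degrees converge quickly towards the unrelated baseline, mismatch-based methods lose resolution beyond the second or third degree, which is where IBD-based approaches become useful.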
To assess inbreeding levels, runs of homozygosity (ROH) were analyzed within individuals, offering insight into potential endogamy or consanguinity within the population.
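Conceptually, an ROH is a long stretch of consecutive homozygous genotypes. The following naive scan illustrates the idea on diploid genotypes along one chromosome. It tolerates no genotyping error and uses arbitrary thresholds, unlike the model-based tools typically applied to ancient genomes; the function name `find_roh` and the toy data are hypothetical.

```python
def find_roh(positions, genotypes, min_length_bp=1_000_000, min_snps=3):
    """Return (start, end, n_snps) for each run of homozygous genotypes.

    positions: sorted base-pair coordinates on one chromosome.
    genotypes: alternate-allele counts (0, 1 or 2); 1 means heterozygous.
    A run is kept if it spans at least min_length_bp and min_snps sites.
    """
    runs = []
    start = last = None
    count = 0

    def close_run():
        # Record the current run if it passes both thresholds.
        if start is not None and last - start >= min_length_bp and count >= min_snps:
            runs.append((start, last, count))

    for pos, g in zip(positions, genotypes):
        if g in (0, 2):          # homozygous site extends the run
            if start is None:
                start, count = pos, 0
            count += 1
            last = pos
        else:                    # heterozygous site ends the run
            close_run()
            start = None
    close_run()                  # handle a run reaching the chromosome end
    return runs


positions = [100_000, 600_000, 1_200_000, 1_900_000,
             2_000_000, 2_100_000, 2_500_000, 4_000_000]
genotypes = [0, 2, 0, 2, 1, 0, 0, 2]
roh = find_roh(positions, genotypes)
```

The summed length of long ROH within an individual can then be compared across individuals: many long segments point to recent parental relatedness, whereas an excess of short segments is more typical of a small background population.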
Acknowledgements
This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (StG-2023, Grant agreement No. 101117101 anthropYXX) and from the French National Research Agency (ANR-22-CE27-0004, GenIn).
References
[1] Ringbauer, H., Huang, Y., Akbari, A. et al. Accurate detection of identity-by-descent segments in human ancient DNA. Nat Genet 56, 143–151 (2024). https://doi.org/10.1038/s41588-023-01582-w
[2] Rubinacci, S., Hofmeister, R.J., Sousa da Mota, B. et al. Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes. Nat Genet 55, 1088–1090 (2023). https://doi.org/10.1038/s41588-023-01438-3