1 Introduction
Since 2001 (Gutierrez et al. 2001; Holden 2001a, Holden 2001b; Motkin 2001), the use of topographic airborne LiDAR data has become an essential part of archaeological prospection and landscape archaeology (e.g., Chase, Chase & Chase 2020; Cohen, Klassen & Evans 2020; Doneus, Mandlburger & Doneus 2020; Maio et al. 2023; Štular, Eichert & Lozić 2021; Štular, Lozić & Eichert 2021, 2023). The results have proven to be very effective in detecting archaeological features and have already dramatically changed our understanding of archaeological sites, monuments, and landscapes, especially in forested areas (recently e.g., Blatrix et al. 2022; Cuenca & López 2020; Doneus, Doneus & Cowley 2022; Dorison 2022; Fernández-Lozano et al. 2022; Filzwieser et al. 2022; Fontana 2022; García Sánchez 2018; Lieskovský et al. 2022; Lozić 2021; Snitker et al. 2022).
However, airborne LiDAR-derived information, or archaeological LiDAR, is too often used as an opaque, black-boxed digital method (for the concept see Dennis 2020: 212–213; Latour 1999: 70, 130, 183–185, 191–193, 304). A common occurrence of this is when LiDAR data derivatives, such as digital elevation model (DEM) visualisations, are accepted as facts rather than ‘facts’ sensu Clarke (1978: 9–27). In other words, visualisations are perceived as hard data that objectively reflect the passive subject of research, which is the archaeological landscape. However, these visualisations are the result of complex data processing full of knowledge-based subjective decisions that must be comprehended and accounted for in archaeological interpretation (Briese et al. 2013; Doneus & Briese 2011; Doneus, Mandlburger & Doneus 2020; Lozić & Štular 2021; Opitz 2013; Lozić & Štular 2020).
We have argued in the past that a move towards theoretically aware, impactful, and reproducible research is needed. This can only be achieved through breaking the black box, which in turn will promote the transition of airborne LiDAR archaeology from a specialist discipline within archaeology to a background method for all aspects of landscape and environmental archaeology (Štular, Eichert & Lozić 2021).
Similar drawbacks have been noted for archaeology in general: the significance of spatial data in decision-making processes is underrepresented, there is a lack of published reference datasets from historic environments, and the spatial data infrastructure necessary to realize the full potential of cultural heritage data is lacking (McKeague et al. 2019, 2020).
To date, the efforts in archaeological LiDAR have primarily targeted domain specialists by introducing improvements in data processing (e.g., Guyot, Lennon & Hubert-Moy 2021; Mazzacca et al. 2022; Storch et al. 2021; Štular, Eichert & Lozić 2021; Štular & Lozić 2020, 2022; Štular, Lozić & Eichert 2021; Toumazet, Simon & Mayoral 2021) and, less frequently, in the documentation of the data processing workflows (Doneus, Banaszek & Verhoeven 2022; Lozić & Štular 2021).
Perhaps an even greater challenge, however, is the dissemination of archaeological LiDAR data to the archaeologists who are not LiDAR specialists. In a recent paper the lack of widespread scientific dissemination of archaeological interpretations of airborne LiDAR data has been identified as one of the major bottlenecks. Currently, we, as domain specialists, are not able to provide archaeologists with the means to critically interact with archaeological LiDAR information. As a result, the full potential of archaeological LiDAR to impact the way we do archaeology cannot be realised (Štular 2022).
Why? Currently, scientific dissemination of archaeological LiDAR almost exclusively uses the predominant formats of scientific publications: online journal articles, printed books and hybrid publications.
However, the current scientific publication landscape does not meet the requirements of archaeological LiDAR. It relies on the distribution of texts that may be supplemented with images and/or formulas. Portable Document Format (PDF), Extensible Markup Language (XML), and Electronic Publication (EPUB) are the primary formats. In certain scientific fields, such as archaeology, printed books and/or journals continue to be essential. Numerous publishers permit files in various formats, such as video and audio files, presentations, and spreadsheets, to be attached as appendices. In addition, the open science movement encourages the publication of relevant datasets in persistent repositories that are then linked to journal articles (e.g., Chandre & Dubois 2021).
This state of affairs is suitable for most scientific fields, but not for all. One of the exceptions that interests us here is map-reliant science, that is, the scientific fields that rely heavily on maps or even on densely populated, large-scale maps. Archaeological LiDAR is such a map-reliant science.
A typical example of current practice in the publication of archaeological LiDAR is the publication of a small-scale map depicting the research area and one or more large-scale map windows depicting specific archaeological features. This conveys a general sense of the space and the types of archaeological features, but the reader has no means of verifying the features throughout the study area. The reader can therefore only believe or not believe that the reported content is factual, i.e., the content cannot be validated and/or replicated. A scientific article based on belief is not good science (Figure 1). In this case, therefore, the gap between the map scales is the semantic gap between belief and scientific evidence.
Bridging the semantic gap. Typically, archaeological LiDAR is published as a small-scale map (left) and large-scale map window(s) (right); the reader has no means of inspecting the relevant details of the full research area. Here, the Kostanjevica na Krasu (Slovenia) case study area is shown (see section 4 for para- and metadata).
This is a component of a larger problem in science, namely the disentanglement of data. When research results are separated from the underlying data that serve as evidence for interpretation, numerous issues arise. These include irreproducibility, lack of reusability, wasted effort in collecting new data, proliferation of unmanaged versions, and insufficient incentives for data sharing. Better linking of data to publications, on the other hand, has the potential to enable new forms of scientific publishing, promote interdisciplinary research, strengthen the link between policy and science, and reduce the cost of replicating research.
For this reason, the journals Science and PLOS ONE, for example, require that the underlying data be deposited. Thus, as indicated, the issue is not confined to archaeological LiDAR. In the social sciences, the vast majority of datasets generated by funded research are never deposited or shared (Altman & Crosas 2013), and we believe that this is also the case in archaeology.
In the current publishing landscape, authors of archaeological LiDAR are left with two options: publishing large-scale maps as appendices or depositing GIS data in repositories.
The first option is limited by size. For example, the small case study shown here is six kilometres by two kilometres. The appropriate scale for displaying it at what is known as 100% crop size is 1:2,000, which would make the map three metres wide. Publishing a three-metre-wide map in digital format is possible, but it is impractical to use on a screen.
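The size estimate above follows directly from the map scale and can be checked with a short calculation. The sketch below is illustrative only: the function names and the 0.5 m display width are assumptions introduced here, not values taken from the case study.

```python
import math


def printed_size_m(ground_extent_m: float, scale_denominator: float) -> float:
    """Size on the map (in metres) of a ground distance at scale 1:scale_denominator."""
    return ground_extent_m / scale_denominator


def screens_needed(map_size_m: float, screen_width_m: float) -> int:
    """Number of screen widths needed to show the map edge to edge."""
    return math.ceil(map_size_m / screen_width_m)


# Case study area: six by two kilometres, displayed at 1:2,000.
width_m = printed_size_m(6000, 2000)   # long side of the map
height_m = printed_size_m(2000, 2000)  # short side of the map

# Assumed display width of 0.5 m (hypothetical, roughly a large desktop monitor).
panels = screens_needed(width_m, 0.5)

print(f"map: {width_m} m x {height_m} m, about {panels} screen widths across")
```

Under these assumptions the reader would have to pan across several screen widths to inspect a single map, which is the practical limitation described above.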
The second option, which allows one to deposit GIS data, is of limited value, as it can only be used by readers who are proficient in GIS. But even for them, setting up a new GIS project with multiple layers is time-consuming and rarely done for “someone else’s” data.
In a tentative database of nearly 500 scientific articles on archaeological LiDAR (https://angkorlidar.org/bibliography), we could find only one example with a map in an appendix and none with published data. One of the reasons for this is probably also the fact that only about 1% of readers of archaeological papers are likely to engage with the appendices and/or the data deposited (Figure 2).
Engagement with archaeological scientific papers. Red: 20 most cited archaeological papers in PLOS ONE;1 blue: our recent relevant paper.2
Outside of the currently accepted scholarly publishing formats, there are solutions that have been tailor made for digital storytelling with maps (e.g., Lozić & Štular 2022a). However, these solutions neither have sufficient persistence built in nor offer scientific recognition.
Similar problems arise in other scientific fields. A prominent example from the field of computational sciences is the simulation-based research that has informed much of the UK’s policy decisions on Covid-19. The research was heavily criticised because the underlying simulation code was initially inaccessible and incomprehensible (Boland and Zolfagharifard 2020).
Some solutions to this problem have been proposed, among them the executable paper, the reproducible paper, and the notebook article.
In an executable paper, the reader should be able to reproduce each step that was taken to reach the conclusions, from the raw data to the polished plot. A fundamental part of creating an executable paper is, therefore, making all material, from the paper itself to the data, available on an accessible platform for readers. The executable paper itself should be linked to a repository containing the raw data. This is achieved by formatting a scientific publication as dynamic software that combines text, raw data, and the code used for analysis, with which the reader can interact (CodaLab n.d.; Lasser 2020b, 2020a). Executable paper is, for the purposes of this paper, indistinguishable from an executable research article, which is a concept and an open-source suite of tools that provide a web-native format for creating computationally reproducible papers (Tsang & Maciocci 2020).
The reproducible paper can, for our purposes, be understood as a stripped-down version of the executable paper. It focuses on a simulation code that the reader can execute to generate data and perform its analysis, thus reproducing the entire research process up to the results (Akhlaghi 2018; Kuttel 2021). Reproducible paper is supported by a handful of computer science and mathematics journals.
Similar, but much broader, is the concept of a notebook article. A notebook article is defined as a basic dissemination unit for nonlinear science. It should be a single document, uniquely and permanently identified with a digital object identifier. It must contain all characteristics of a standard research article that are generally accepted as the best practices for scientific dissemination. It must be accurate, well documented, robust, persistent, and easily accessible. The user should be able to automatically generate a printable document such as a PDF, but also run the document on a cloud computing platform to check and reproduce the results (Chandre & Dubois 2021).
In conclusion, all solutions strive to achieve the Open Science objectives of creating an accessible, transparent, and reproducible method of research communication. The notebook article is the most thorough, the reproducible paper is the most practical, and the executable paper falls somewhere in between. Only a handful of publishers have considered any of the three. Moreover, they are all solutions developed by programmers for programmers, i.e., within computer science. However, most researchers, including the vast majority of archaeologists, are not computer scientists.
In archaeology, there is indeed a field that deals with very similar challenges: The application of 3D modelling to cultural heritage. Appropriate solutions have been sought for more than a decade (e.g., Barnes et al. 2013; Counts et al. 2020; Opitz & Johnson 2016; Potenziani et al. 2015; Štular & Štuhec 2015) and the current state of the art is journals such as Digital Applications in Archaeology and Cultural Heritage, which disseminate “original” articles as PDF files with embedded hyperlinks to 3D models available as a cloud-based service.
We believe that map-reliant science shares the need for openness, reproducibility, and persistence with executable paper, but it must adapt to its users following the example of 3D modelling. The aim of this paper is to propose a possible solution. Following the accepted term executable paper, we have termed our solution Executable Map Paper (EMaP). We will present a proof of concept in the form of a demonstrator for EMaP that meets the requirements of archaeological LiDAR. However, we believe that the same approach can be used in many other scientific fields, for example, remote sensing, archaeology, and geology.
As such, this article contributes to the transparency of research, accelerates dissemination, and promotes the reuse of scholarly data. We would like to stimulate debate on the crucial role of digital technologies in archaeological LiDAR research and promote their theoretically informed and interdisciplinary use.
2 Reproducibility and accessibility in archaeology
The problem of effective dissemination of spatial data in the context of scientific publications is not unique to archaeological LiDAR. On the contrary, this is a long-standing issue not only in archaeology but also in other fields such as geography and geology, and it is closely intertwined with the wider issue of reproducibility and accessibility.
Scientific fields beyond archaeology are not the subject of this paper, other than to note that the same problems are being addressed, but no widespread solution has been adopted (e.g., Lombardo, Piana & Mimmo 2018; Sudaryatno, El-Yasha & Nur’aini’Afifah 2019; Zhang et al. 2007; Zhou et al. 2016).
However, for a better understanding of the EMaP, the archaeological context is presented. The publication of geospatial data touches on broader issues in archaeology in a variety of domains, including the sustainability of digital data repositories, data accessibility and reliability, standardisation of data formats, property management and ethical concerns.
Data dissemination and data archiving are intricately intertwined. The SHARE IT project, whose goal was to develop a strategy for archiving and disseminating spatial archaeological landscape data sets, can serve as a starting point for describing the recent development. Identifying suitable digital archiving strategies, data formats and standards, metadata requirements, international framework integration, and copyright and access policies were among the research challenges. The project emphasized the significance of international standards-compliant archival systems, which can only be constructed with the correct use of metadata, controlled vocabularies, the definition of preferred data formats, the need for comprehensive copyright and access policies, and cost models for implementing archival strategies (Shaw, Corns & McAuley 2009).
The protagonists of the European projects ARIADNE and SEADDA built upon this.
The project Advanced Research Infrastructure for Archaeological Data Networks in Europe (ARIADNE) started in 2013, ran as ARIADNEplus between 2019 and 2022, and continues its activities as the association ARIADNE RI AISBL (https://ariadne-infrastructure.eu). This project made significant progress toward its objective of creating a new research infrastructure for archaeological data networks in Europe by combining and linking existing archaeological datasets into a single research infrastructure. Simultaneously, it promoted a culture of free data access and reuse (Aloia et al. 2017; Richards & Niccolucci 2019; Štular, Niccolucci & Richards 2016). While ARIADNE was successful in providing a single point of access to (mostly) European archaeological datasets, spatial data was not its focus, and consequently no spatial-data-focused infrastructure was developed.
Saving European Archaeology from the Digital Dark Age (SEADDA) is an ongoing cooperation project (https://www.seadda.eu). Its top priority is to make archaeological data open and freely accessible, while demonstrating that the field lacks suitable persistent repositories.
A comprehensive overview of the current state of European archaeological data archiving is one of the major accomplishments of the SEADDA project to date (Jakobsson et al. 2021). In the past decade, innovation has centred on making archaeological data more interoperable, both to increase data discoverability through integrated cross-search and to facilitate knowledge creation by combining data in novel ways. The emerging research challenge of the next decade will be optimizing archaeological data for reuse and defining what constitutes good practice regarding reuse. In the future, archaeology will require not only improved data curation policies, but also the harmonization of data creation and archiving processes (Richards et al. 2021).
However, the European (and global) landscape of digital archaeological data curation consists of “haves” (e.g., Hollander 2021; Jakobsson 2021; Nicholson, Fernandez & Irwin 2021; Novák, Kuna & Lečbychová 2021; Richards 2021) and “have-nots” that produce relevant data but struggle to obtain funding to build and maintain bespoke databases with online access (e.g., Kreiter 2021; Oniszczuk & Makowska 2021; Štular 2021). Inequity in access to a persistent and adequate archive or repository of digital data extends to access to the digital infrastructure required to create, curate, and utilize research data to its fullest extent (Corns, Kennedy & Štular 2015; Lozić & Štular 2023; Wright 2018).
Important in this respect is the Infrastructure for Spatial Information in Europe directive, which offers limited guidance for archaeological datasets centered on spatial data (INSPIRE 2007). It was used as a foundation to propose the formalized management of primary research data via an archaeological spatial data infrastructure that would deliver more efficient data with broader benefits, such as harmonizing and publishing spatial data according to consistent standards (McKeague et al. 2019, 2020).
Another challenge is geospatial Big Data. The term refers to datasets with location information that exceed the capabilities of commonly available hardware, software, and personnel. While current datasets are still manageable, archaeology faces the same challenges as other disciplines, particularly in terms of data quality and privacy concerns. These data will have a significant impact on areas such as cultural history writing, decision-making, and visualization of the past. To maintain scientific and ethical data consistency, it was proposed to include a quality report with each dataset (McCoy 2017).
In this context, the significant recent advances in the field of archaeological 3D GIS (Dell’Unto & Landeschi 2022) should be mentioned, as significant synergies with archaeological LiDAR can be expected in the near future.
Before open data sharing in archaeological LiDAR can occur, ethical considerations must be taken into account. In addition to, or in contrast to, the scientific benefits of making data accessible, the means for engaging with local communities and public education, as well as the role of stakeholders (e.g., the potential for damage to the archaeological record), must be considered (Chase, Chase & Chase 2020; Cohen, Klassen & Evans 2020; Fernandez-Diaz & Cohen 2020).
In conclusion, we can only reiterate the findings from a recent review of the current state of spatial data management in archaeology. Technical solutions exist, but a long-term transnational strategy is required to deliver on the promise of open and sustainable spatial archaeological data for all user groups (McKeague et al. 2019). Such solutions take years to materialize, so ARIADNE, SEADDA, and other initiatives are merely a starting point. In this context, the EMaP proposed in this article can be viewed as an interim solution that can be immediately implemented within the existing ecosystem.
3 Executable Map Paper: Content Specifications
We have based the content specifications for EMaP on the executable paper briefly presented above. Here, we look closely at the content specification of an executable paper. According to Lasser (2020a; 2020b) it should be able to meet five demands.
First, the paper should display nicely formatted text, including references and links, just as in a traditional journal article.
Second, it should display figures, diagrams, maps, videos and interactive elements.
Third, if applicable, it should display the code used to create such diagrams from the data.
Fourth, if applicable, it should display the interpretation of this code and user input in the case of interactive elements.
Fifth, the application should be built entirely from open-source components and hosted under a free licence so that it can be easily shared and reused.
In principle, our definition of the EMaP adheres to the above where applicable. The sole exception is a part of the fifth demand, which states that the executable paper must be built entirely from open-source components.
An open-source platform is of course highly desirable, but it is unrelated to the stated aims of the executable paper. Most importantly, it is not feasible. In the current scientific publishing landscape, the most prominent open-source platform, Open Journal Systems, is used mainly by small publishers. However, the landscape is dominated by a handful of large publishers that operate proprietary closed-source platforms. Widespread adoption of open-source platforms is therefore not currently the case and is unlikely to occur in the foreseeable future.
In addition to the above definition of the executable paper, we would like to emphasise the importance of scientific recognition, which is only hinted at in the first demand. Scientific recognition is the key component of public science (that is, non-commercial science in the sense that researchers are not directly employed by for-profit companies). In public science, funding and researchers’ careers are directly related to scientific recognition. One cannot exist without the other.
Therefore, scientific recognition is a prerequisite for any scientific work, including executable papers. Since scientific recognition is measured by well-established and frequently conservative metrics (Bollen et al., 2005; Pendlebury, 2009) that only consider established publication formats, it is crucial that any executable paper adheres to or “mimics” such a format. The above-mentioned story map, for instance, is scientifically unrecognized regardless of its content.
Based on the presented demands for executable papers and on our own reservations and additions, we have constructed requirements for EMaP. EMaP must provide (text in italics pertains specifically to the archaeological LiDAR):
- A journal-article-like frontend that displays formatted text including figures, charts, references, etc. in at least a PDF format (XML and EPUB may be offered in addition);
- a hyperlinked service for feature-rich interactive maps (displaying at least large-scale enhanced visualisations of LiDAR-derived digital feature models and a geodatabase of archaeological features);
- metadata and paradata used for the creation of interactive maps (for example, following Lozić & Štular 2021);
- means for easy sharing (for example, an intuitive geoservice) and reuse (for example, the full dataset deposited in a trusted, persistent repository).
4 Executable Map Paper: Technical Specifications
4.1 Persistence of hyperlink
The above requirements for EMaP therefore rest on hyperlinks embedded in the PDF. Hyperlink technology has been available in PDFs for decades (ISO Standard, 2020), but it is not widely used in scientific publications. The main obstacle is the persistence of the hyperlink, or rather, the lack of it. Published scientific articles are supposed to be available in an unaltered WYSIWYG (“what you see is what you get”) form “forever”. Some two decades after online-only and online-first publications became abundant, we have yet to hear of a significant repository of scientific articles being discontinued, and there are archiving systems in place such as CLOCKSS (Controlled Lots of Copies Keep Stuff Safe) and LOCKSS (Lots of Copies Keep Stuff Safe).
A hyperlink is much more ephemeral than a repository of PDFs. A URL may change as servers are migrated, a web service may become obsolete, or the content may simply not be updated to work with ever-evolving Internet security protocols. Anecdotally, the lifespan of a web service is measured in years at most.
What hyperlink technology lacks to become viable for scientific publication, thus, is persistence. There are (at least) two ways to improve this: using a persistent backend or a persistence layer.
The solution with a persistent backend could consist of a user-facing frontend, a user-accessible interactive map, and a backend that runs the interactive map (Figure 3). The persistence in such a setup would be achieved through a persistent backend repository that is able to host the hyperlinked feature-rich content, in our case, a sandboxed geoservice. There are technical solutions for sandboxing a geoservice, but they need to be custom developed and are relatively expensive to maintain, especially for the data hungry archaeological LiDAR. Any long-term viable solution for EMaPs, on the other hand, needs to be cost-effective so that it can be provided to the user for free or at an acceptable one-off price (along the lines of the established Diamond or Gold Access standards for publishing). Without going into details of how science and scientific publications are financed, it is obvious that a system that would include annual maintenance fees for a scientific paper after it has been published is not feasible. In addition, it would require changes to the publishers’ digital infrastructure that are unlikely.
The EMaP system with a persistent backend. It consists of a user-facing frontend, an interactive map accessible by users, and a backend (geoservice) that runs the interactive map.
The solution with a persistence layer can be modelled on the approach successfully used by the DOI® system, among others. It is based on the same user-facing frontend and a user-accessible geoservice. The latter does not need to be sandboxed and can be hosted anywhere, as persistence is achieved by adding a persistence layer between the two.
The solution with a persistence layer is, in our view, the only viable solution that can be deployed immediately, because it doesn’t require any changes in the existing publishing infrastructure. It is described in more detail below (Figure 4).
The EMaP system with a persistence layer. The persistence layer handles the communication between the PDF frontend and the feature-rich service; if the service is replaced or moved, the persistence layer is updated accordingly so that the hyperlink will always resolve to the intended service.
4.2 Frontend
The user-facing frontend must be based on a PDF format. This is by far the most widely used format for scientific publications (e.g., Chandre & Dubois 2021), with XML and EPUB a distant second and third. The entire scientific publishing landscape is based on the PDF format, and the industry is just completing the investment cycle for the transition from print to digital publications. As the industry is currently facing another transition – from a reader-pays to an author-pays model (so-called open access) – it is unlikely to invest in new file formats. Moreover, most researchers have invested a lot of effort in personal bibliographic databases that integrate publications in PDF format. Therefore, changing the format would alienate publishers and readers. This in turn would have a negative impact on the scientific recognition of such a publication, which would alienate the last party in this three-way transaction, the authors.
One of the strengths of the PDF format is its versatility. Among other things, it allows (relatively simple) interactive map or 3D content to be embedded. However, such a “feature-rich” PDF is only available if the proprietary Adobe software is used by both the creator and the reader. Therefore, these advanced features are not supported by most publishers and are not popular with readers, which we will illustrate with three examples.
The first two examples deal with 3D models. One of the most widely read articles on embedding 3D content in scholarly publications was published in 2013 by what was then the largest scientific publisher in the world (Barnes et al. 2013). The article, which described the methodology for embedding 3D content in PDF files, was published in PDF (and XML). However, the feature-rich PDF file with embedded 3D models was only provided as an appendix, that is, as a reproduction of the “original” article. Although the content was a demonstrator of the technology, 87% of readers only viewed the online content, 13% downloaded the “original” PDF, and 0.5% engaged by saving the content; the feature-rich PDF was therefore accessed by at most 13% of readers, but the number is more likely closer to 0.5% (the metrics were accessed on 18 November 2022). This clearly shows the aversion of both publishers and readers to anything that is not the “original” paper, regardless of its features.
The second example is our own attempt to introduce an alternative file format for embedding 3D models in a scientific publication in iBook® (Štular et al. 2013; Štular & Štuhec 2015). The content only gained traction in the scientific community when a feature-poor PDF “offprint” of the original was provided on academic social media. We explain this by the limited accessibility of the proprietary media.
The final example that illustrates the need for the PDF format as a frontend is the above-mentioned executable research article. However, it is only created in addition to the original paper (Tsang & Maciocci 2020). This makes it a type of appendix, which, as shown above, has limited appeal. Therefore, despite its potential to offer a new level of transparency, reproducibility, and interactivity, we believe that solutions based on web-native formats have relatively little chance of wider acceptance in the scientific publishing landscape.
Based on these examples, we believe that the EMaP must be perceived by the publishers and the readers as a “normal” PDF file format. Thus, the only option to introduce feature-rich content is via hyperlinks, as is the course of development with 3D models. We believe that a geoservice accessible via persistent hyperlinks in PDF can provide the desired seamless interaction with feature-rich scientific content (an interactive map) that is acceptable to both publishers and the research community. Any hyperlink-based solution can also be easily implemented in XML and EPUB formats.
4.3 Persistence layer
We propose that the persistence of the hyperlink embedded in PDF is achieved by a persistence layer. Its task is to act as an intermediary between the PDF (frontend) and a geoservice (which in this case acts as the backend), thus enabling the hyperlink to always resolve to the intended feature-rich content, even if the service has been moved or changed.
There are many possible identity resolution systems (see, e.g., the overview by Ren et al. 2020, which is also relevant for our needs), and we have tested two: a controlled pointer (URL) and a handle.
Technically, the simplest solution consists of a controlled URL pointer. The pointer resolves to a trusted server, which is programmed to redirect the request to the service. If the service changes or moves, the original pointer is simply rerouted to the new service.
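The controlled pointer can be illustrated with a minimal sketch of a redirecting web application. It assumes a small lookup table on the trusted server; the paths, host names, and function names below are hypothetical and do not describe the setup we tested.

```python
from typing import Callable, Dict, Iterable

# Hypothetical pointer table: stable path printed in the PDF -> current
# location of the geoservice. Only this table changes when a service moves.
POINTERS: Dict[str, str] = {
    "/emap/demo-map": "https://geoservice.example.org/maps/demo",
}


def repoint(path: str, new_target: str) -> None:
    """Reroute an existing pointer after the service has moved."""
    POINTERS[path] = new_target


def app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """WSGI application that redirects a stable path to its current target."""
    target = POINTERS.get(environ.get("PATH_INFO", ""))
    if target is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"unknown pointer"]
    # A temporary redirect (302) is used so that clients do not cache a
    # target that may later be rerouted.
    start_response("302 Found", [("Location", target)])
    return [b""]
```

Such an application could be served locally with the standard library, for example `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`. The hyperlink in the PDF never changes; when the geoservice moves, a single call to `repoint` restores the link.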
The handle is a more advanced solution. We have opted for the industry standard solution, the Handle System® (DONA Foundation n.d.). It is a distributed computer system which stores names, termed handles, of digital items and that can quickly resolve those names into the information necessary to locate and access the items. It is a general-purpose global system for the reliable management of information on networks such as the Internet over long periods of time. One of the best-known users of this system is the International DOI Foundation, which is the governance and management body for the DOI® System. They state that “…the Handle System® … (is) the best infrastructure component available today for managing digital objects” (DOI Foundation, 2017).
The Handle System® was developed as a resolution system for digital objects and serves as a level of indirection to any type of current state data that one wishes to associate with the object via the identifier resolution mechanism. The Handle System® provides a way to use DNS and URLs for identifiers and, at the same time, an identifier that can be resolved without DNS and URLs if one chooses to use it that way. Most uses involve DNS, either as a way to get common web browser clients to communicate with handle servers (for example, https://hdl.handle.net/20.500.12102/ariadne-plus-pilot), or as the current status data returned by that resolution (for example, https://ariadne-plus-pilot.zrc-sazu.si).
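As an illustration of the resolution step, the following sketch queries the public Handle proxy for the values stored under a handle and extracts those of type URL. It assumes that the proxy exposes a JSON interface under https://hdl.handle.net/api/handles/ and that each stored value carries a "type" field and a "data" object with a "value" entry; both should be checked against the current Handle.net documentation. The function names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumed base address of the Handle proxy's JSON interface.
PROXY_API = "https://hdl.handle.net/api/handles/"


def extract_urls(record):
    """Return the data of all values of type URL in a parsed handle record."""
    return [
        value["data"]["value"]
        for value in record.get("values", [])
        if value.get("type") == "URL"
    ]


def resolve_handle(handle, timeout=10):
    """Fetch a handle record from the proxy and return its URL values."""
    with urlopen(PROXY_API + handle, timeout=timeout) as response:
        record = json.load(response)
    return extract_urls(record)
```

Under these assumptions, resolve_handle("20.500.12102/ariadne-plus-pilot") would return whatever current state data the caretaker has stored under the handle, which is the level of indirection described above.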
The handle system provides a technical infrastructure: a resolution service, shared by all implementations of the system. Its protocols ensure consistency and interoperability for resolution purposes between a variety of implementations. At the application level, consistent rules do not necessarily exist across multiple applications. The system licence does not include ongoing technical support, and the system is usually installed and managed by the user’s technical staff, which does not facilitate its use by the public. The technology allows relationships or multiple resolutions to be expressed. For example, one entity can be resolved into multiple other entities; this can be used to represent a parent-child or similar relationship. The system is maintained and upgrades to the global general purpose naming service are provided (Foundation 2017, n.d.; Ren et al. 2020).
Therefore, the handle system provides persistence, consistency, technical infrastructure, semantic interoperability, active development, and independent governance. In practice, if the landing page changes, the author/caretaker/archivist simply has to make the necessary changes to the metadata. However, the implementation of these changes is the responsibility of the entity that created the handle (e.g., the author of the geoservice) and not of the system. In other words, the Handle System® technology provides persistence only if it is used with appropriate social infrastructure:
“Persistence is a function of organizations, not of technology; a persistent identifier system requires a persistent organization and defined processes.”
We have implemented a handle system solution for an archaeological LiDAR example that can be embedded in EMaP. The handle “20.500.12102/ariadne-plus-pilot” consists of the ZRC SAZU’s prefix “20.500.12102” and the object identifier “ariadne-plus-pilot”. The handle can be entered in the Handle.net® web form, which resolves individual handles and displays their associated values. However, the most common use of the handle is as a URL using the Handle proxy: “https://hdl.handle.net/20.500.12102/ariadne-plus-pilot”. If the existing geoservice can no longer be maintained, the landing page can be redirected to a new one or, in the worst case, to data deposited in a persistent repository.
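The structure of the handle and its use as a proxy URL can be expressed in a few lines. This is a sketch only; the helper names split_handle and handle_to_url are hypothetical, and the split at the first slash reflects the prefix/identifier structure described above.

```python
from urllib.parse import quote

HANDLE_PROXY = "https://hdl.handle.net/"


def split_handle(handle):
    """Split a handle at the first slash into (prefix, object identifier)."""
    prefix, separator, suffix = handle.partition("/")
    if not separator or not prefix or not suffix:
        raise ValueError(f"not a valid handle: {handle!r}")
    return prefix, suffix


def handle_to_url(handle):
    """Build the proxy URL under which a handle can be opened in a browser."""
    prefix, suffix = split_handle(handle)
    # Percent-encode characters in the identifier that are unsafe in a URL path.
    return f"{HANDLE_PROXY}{prefix}/{quote(suffix, safe='/')}"
```

For the handle used in this article, split_handle yields the prefix “20.500.12102” and the identifier “ariadne-plus-pilot”, and handle_to_url reproduces the proxy URL quoted above.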
However, as described above, the implementation of a handle is no more persistent than a pointer, regardless of technical sophistication. It is maintained only as long as the institution/author deems the effort attainable. In practice, this particular handle is just as likely or unlikely to be actively maintained beyond the author’s professional engagement with the institution as the pointer.
The take-home message is that persistence is a function of organisation, not technology. Even if the DOI system were used for the persistence layer, the same problem would remain (most users are not aware that the responsibility for the persistence of a DOI record lies with the DOI registration agency, for example, the publisher of a journal). In this respect, there is no practical difference between a pointer, a handle system, or a DOI system.
4.4 Backend
We tested four available geoservices for the backend: D4Science, GIS Cloud, QGIS Cloud, and ArcGIS Instant App.
D4Science is currently offering a beta version of a geoserver it is developing in collaboration with ARIADNEplus. The service, called GeoNa Prototype (https://ariadne.d4science.org/web/geona-prototype), is, as the name suggests, not yet ready for production use. It lacks user documentation, bulk import of thousands of POIs, and the ability to add custom base layers. The latter is crucial for archaeological LiDAR and the most difficult to implement due to the relatively large datasets (over 0.5 GB for our case study).
GIS Cloud (https://www.giscloud.com) and ArcGIS Instant App (https://www.esri.com/en-us/arcgis/products/arcgis-instant-apps/overview) are commercial and closed source services, while QGIS Cloud (https://qgiscloud.com) is open source. However, for the use intended here, all three services require a monthly fee. They differ in cost and specific application scenarios, but in general they offer similar features at a similar price and are all suitable for our purposes. We chose ArcGIS Instant App for practical reasons.
5 Results: Executable Map Paper Demonstrator
Producing a full EMaP is beyond the scope of this article for an obvious reason: the social and/or institutional infrastructure needed to publish such articles, that is, a scientific journal, does not yet exist. Instead, we will demonstrate the technical solutions for all four main requirements of EMaP.
The first requirement, the PDF frontend, is represented by this article.
The second requirement, a hyperlinked interactive map, is represented in Figure 5. As described above, we have implemented two solutions, pointer and handle.
The pointer is the following URL: “https://ariadne-plus-pilot.zrc-sazu.si”. It is hosted on a server maintained by ZRC SAZU, a public research institution that employs the authors; the institutional context provides a reasonable assurance of persistence. The pointer is programmed to resolve to the geoserver we currently use (https://zrc.maps.arcgis.com/apps/instant/sidebar/index.html?appid=6eb94108f66f40058f2dc240b6424531) and can be easily reprogrammed to resolve to a new service if needed.
Figure 5. Kostanjevica na Krasu (Slovenia), interpretation of archaeological features derived from LiDAR data (see text for para- and metadata). The interactive map is available at https://ariadne-plus-pilot.zrc-sazu.si.
The handle solution is provided through the Handle System®. The handle “20.500.12102/ariadne-plus-pilot” is, as mentioned, also accessible via URL “https://hdl.handle.net/20.500.12102/ariadne-plus-pilot”.
The third requirement concerns the meta- and paradata needed to reproduce each step in the creation of the hyperlinked map. There are no universally accepted meta- and paradata standards for archaeological LiDAR, but we have recently proposed a schema that can be used in their place. In fact, the meta- and paradata associated with this demonstrator are published in the aforementioned article as an example (Lozić & Štular 2021: Tables 3 and 4).
The fourth requirement relates to sharing and reuse tools. This article’s free availability facilitates its dissemination. We deposited the data in appropriate GIS formats in the reputable, persistent repository Zenodo so that they may be reused (Lozić & Štular 2022b).
6 Conclusions
As mentioned, archaeological LiDAR has become an essential part of archaeological prospection and landscape archaeology, but LiDAR data are all too often used as an opaque, black-boxed digital method. One of the challenges in breaking the black box is appropriate scientific dissemination.
Current practice is to publish a small-scale map of the research area and one or more large-scale map windows showing specific archaeological features. This conveys a general sense of space and the nature of archaeological features, but the reader has no way to corroborate the features in the entire research area. To overcome this challenge, this article introduced the concept and demonstrator of EMaP.
First, we defined the four requirements for EMaP: PDF frontend, hyperlinked interactive map, meta- and paradata, and easy sharing and data reuse.
We then presented two technical solutions for EMaP, one based on a persistent backend and the other on the persistence layer. As LiDAR is very data intensive, a persistent backend is currently not a viable solution for widespread adoption in scientific publications.
Thus, we have further explored the solution with a persistence layer. The way it works is that the metadata in the persistence layer is adjusted if the service running the interactive map is changed/moved. Thus, the hyperlink embedded in the frontend will always resolve to the intended service.
There are numerous technical options available for implementing a persistence layer, and we tested two: a controlled pointer (URL) and a handle. Although the handle system is a vastly more technically advanced solution than a pointer, it shares the same fundamental flaw: persistence is a function of organization, not technology. Even if the DOI system were utilized for the persistence layer, the same issue would persist (a fact lost on most users of the DOI system).
The persistence layer is therefore not the optimal solution for EMaP. The ultimate goal is to devise a sandboxed solution that contains all the content embedded in a single file. However, since any possible technical solution will have to be widely accepted in the scientific publishing landscape, there is no short-term (or even medium-term?) sandboxed solution in sight. The proposed solution is therefore the best one that can be deployed forthwith in the existing publishing landscape.
In the near future, the proposed technical solution could be greatly improved and endorsed if it were advocated by a social organisation. Such an organisation would provide standards and ideally host the geoservices, thus being a guarantor of persistence. Given the recent development of virtual research environments (VREs) such as D4Science (Assante et al. 2019), this seems a realistic development in the short term.