Program

Collect & Connect: Archives and Collections in a Digital Age

====> Register here & listen to presentations and keynotes for free! <====

The proceedings of the conference are published in open access as volume 2810 of the CEUR workshop series, see here

Below you find the program of the conference Collect&Connect. For more information about the registration click here. The conference will be managed with the service NetworkTables in combination with ZOOM. NetworkTables allows you to claim your seat for specific sessions of the conference. Moreover, NetworkTables is a great tool to get in touch with colleagues in the field. So please fill in your profile when you register for a NetworkTables account, see instructions here.

Finally, three more general remarks:

Since the virtual conference is managed from Leiden, the Netherlands, all timeslots refer to CET (=Central European Time). For a useful tool to calculate the difference between different time zones click here.
Participating in the conference is free of charge. However, please be aware that each session has a maximum number of participants, so do not wait too long to claim your seat for the sessions you would like to attend! To claim a seat, please click on the links marked with =====> .... <====== in the session descriptions below.
If you tweet about the conference, use the hashtag #COLCO2020.

23 November 2020:

======> CLAIM YOUR SEAT FOR THE OPENING SESSION HERE! <=======

11:00 – 12:30: Opening & keynote

Words of welcome
Prof. Jaap van den Herik (PI/Leiden University)
Dr. Christiane Klöditz (NWO)
Stefan Einarson (Brill publishers, Leiden)
Peter Schalk (Naturalis)
Keynote lecture on From Event to Data Set: Perspective, structure, and the problem of representation in data-driven digital history by Sharon Leon (Michigan State University). (see abstract below ⬇ )
Session chair: Andreas Weber (U Twente)

12:30 – 13:00: Lunch break

13:00 – 14:00: Paper presentations I

======> CLAIM YOUR SEAT FOR THIS SESSION HERE! <======

S. Ellis, Emergent archives and crowdsourced narratives: two development stories from the Queensland State Library
L. Zuckerman, Linked Data and Holocaust-Era Art Markets: Gaps and Dysfunctions in the Knowledge Supply Chain
B. Hendriks, P. Groth and M. van Erp, Recognising and Linking Entities in Old Dutch Text: A Case Study on VOC Notary Records

Session chair: Katy Wolstencroft (U Leiden)

14:00 – 14:30: Coffee break

14:30 – 15:30: Paper presentations II

======> CLAIM YOUR SEAT FOR THIS SESSION HERE! <======

M. Koolen, R. Hoekstra, I. Nijenhuis, R. Sluijter, R. van Koert, E. van Gelder, G. Brouwer and H. Brugman, Modelling Resolutions of the Dutch States General for Digital Historical Research
L. Viola and A. M. Fiscarelli, From digitised sources to digital data: Behind the scenes of (critically) enriching a digital heritage collection
B. Bogacz, S. Finlayson, D. Panagiotopoulos and H. Mara, Visualizing and Contextualizing Outliers in Aegean Seal Collections

Session chair: Maarten Heerlien (Rijksmuseum Amsterdam)

15:30 – 15:45: Coffee break

15:45 – 17:00: Round table discussion on Semantics and Beyond (see abstract below ⬇)

====> CLAIM YOUR SEAT FOR THIS ROUND TABLE HERE! <======

Semantics and Beyond: Modeling and enriching longue-durée biocultural data for answering interdisciplinary and epistemic research questions

Panel convener: Martha Fleming (Natural History Museum of Denmark)
Roundtable chairs: Dominik Hünniger (Hamburg University) and Katy Wolstencroft (Leiden University)
Panelists: Sally Chambers (Ghent University), Isabelle Charmantier (Linnean Society), Tahani Nadim (Museum für Naturkunde Berlin & Humboldt-University), Nicky Nicolson (Royal Botanic Gardens, Kew), Neil Safier (John Carter Brown Library & Brown University)

=====> END OF CONFERENCE DAY <=====

24 November 2020:

======> CLAIM YOUR SEAT FOR THIS KEYNOTE LECTURE HERE! <======

11:30 – 12:30: Keynote lecture on The Data Challenge for Cultural and Natural Heritage by Franco Niccolucci (PIN - University of Florence) (see abstract below ⬇)

Session chair: Marco de Niet (Leiden University, University Library)

12:30 – 13:00: Lunch break

13:00 – 14:00: Paper presentations III

======> CLAIM YOUR SEAT FOR THIS SESSION HERE! <======

C. Rinaldo, D. Castronovo, J. Deveer and D. Rielinger, Supporting Natural History Science by Connecting Collections
M. Ameryan and L. Schomaker, A high-performance Word Recognition System for the Cultural Heritage of the Natuurkundige Commissie
G. Barabucci, F. Tomasi and F. Vitali, Supporting complexity and conjectures in cultural heritage descriptions

Session chair: Eulàlia Gassó Miracle (Naturalis Leiden)

14:00 – 14:30: Coffee break

14:30 – 15:30: Demo lab

======> CLAIM YOUR SEAT FOR THE DEMO LAB HERE! <======

Natuurkundige Commissie Online (NCO Online), Brill publishers (E. Suyver & E. Posthumus)
Transcribathon, Europeana / Facts & Files (F. Drauschke)
Lombroso project (C. Cilli)
Photography in Tuscany: stories of a cultural heritage (F. Strobino & L. Santos)

Session chair: Katy Wolstencroft (U Leiden)

15:30 – 15:45: Coffee break

15:45 – 17:15: Keynote lecture & Closing of the conference

======> CLAIM YOUR SEAT FOR THE CLOSING SESSION HERE! <======

Keynote lecture on From Pixels to Knowledge Using AI: Where do the humans fit in? by Lambert Schomaker (University of Groningen) (see abstract below ⬇ )
Closing of the conference
Session chairs: Jaap van den Herik & Andreas Weber

Keynote lecture by Sharon Leon (Michigan State University)

From Event to Data Set: Perspective, Structure, and the Problem of Representation in Data-Driven Digital History

Digital historians are well familiar with the notion that the larger community of historians generally has been skeptical of and cautious about data-driven scholarship. In the wake of the widespread reaction against cliometrics, historians generally have been private about their work with data, presenting only end products, narratives, and summaries, even when that work is data-driven, but not all that computationally sophisticated. Because this work is often a small part of a much larger interpretive process, many who do minor work with data never even note that they have a set of spreadsheets or a database that they used to organize and analyze their source materials. This tendency has worked to mask the role that data collection and analysis play in contemporary historical scholarship, and to undermine the potential that resides in the aggregation of, and computational engagement with, that data. Taking data seriously as capta, as Johanna Drucker suggests, highlights their constructedness as source materials and places them in an important trajectory in the lifecycle of historical evidence. That trajectory includes the initial creation of the record, its elevation to the status of a piece of information that should be preserved, its preservation, its preparation for research access, its review by an historian, its transformation into structured data, and its publication in a digitally accessible form. In scrutinizing this lifecycle, historians can come to a renewed awareness of the constructed nature of data, and of the individuals who help to shape access to evidence about the past, including record creators, archivists, historians, and technologists. Using records related to the history of enslavement as a case study, this talk will explore the lifecycle of historian-created data and the ways that our insights about the past might be enriched by applying computational techniques.

Keynote lecture by Franco Niccolucci (PIN - University of Florence)

The data challenge for cultural and natural heritage

The challenge of managing cultural and natural heritage data depends on their ever-increasing amount vis-à-vis researchers' limited time to access, check and possibly use them. Their intrinsic nature of being heterogeneous, fragmentary and interpretive makes such data difficult to find, examine and fully re-use with off-the-shelf tools. The lecture will analyse the requirements of a comprehensive approach, discussing which technology is available, which one needs further (but doable) work and which one appears as a promising solution but hides traps that can make it totally ineffective. The focus will be on semantics and on machine-actionable services, in order to guarantee a reliable and efficient data re-use as envisaged by the FAIR data principles.

Biography:

Franco Niccolucci is the director of the VAST-LAB research laboratory at PIN in Prato, Italy. A professor at the University of Florence until 2008, he directed the Science and Technology in Archaeology Research Center at the Cyprus Institute, Nicosia, until 2013. Prof Niccolucci has coordinated several EU-funded projects on the applications of Information Technology to Archaeology, and is currently the coordinator of ARIADNEplus, a research infrastructure on archaeological data. His main research interests concern knowledge organization of archaeological documentation and the communication of cultural heritage. He is currently the Editor-in-Chief of JOCCH, the ACM Journal of Computing and Cultural Heritage. He has authored about 100 papers and book chapters.

Keynote lecture by Lambert Schomaker (University of Groningen)

From pixels to knowledge using AI: Where do the humans fit in?

During the last decade, tremendous advances have been made in artificial intelligence, more precisely, within the neural-networks branch of machine learning. However, most of the impressive examples are in self-defined problems from within academia, such as playing video games, playing the game of Go, classifying objects on a standardized, segmented and preprocessed image set, mimicking a painter's style, etc. Each of the successes then leads to a performance rat race within such artificial applications. In the real world of industry and scholarly research it appears to be much harder to quickly replicate the impressive results. This is especially true for the area of historical manuscript analysis. Very good results have been obtained, as will be shown in this presentation. However, it is also becoming increasingly clear that the role of humans is still essential, in all stages, from preprocessing and segmentation, up to the text recognition itself. The good news is that the remaining points of friction allow us to sketch out an agenda for AI research in the coming decade.

Biography:

Lambert Schomaker is Full Professor of Computer Science and Artificial Intelligence at the University of Groningen. He received his M.Sc. degree in psychophysiological psychology in 1983 (cum laude), and his Ph.D. degree on "Simulation and Recognition of Handwriting Movements" in 1991 at Nijmegen University, The Netherlands. Since 1988, he has been working in several European Esprit projects concerning the recognition of on-line, connected cursive script on the basis of knowledge on the handwriting movement process. He was the project coordinator of a large European project on multimodality in multimedial interfaces (project MIAMI), and has enjoyed collaborative research projects with several industrial companies. Current projects are in the area of image-based retrieval, on-line and off-line handwriting recognition, forensic writer identification, and cognitive robot navigation models. Apart from research, his duties involve teaching courses in artificial intelligence and pattern classification. His work on neural networks for handwriting and gesture recognition is a precursor to modern handwriting and gesture-recognition methods on tablet computers such as the iPad. He is currently active in a 30 MEuro multidisciplinary research-valorisation project ('Target') in mass-storage, high-performance computing and datamining, in order to implement the MONK generic search engine for handwritten historical archives. The MONK system is unique worldwide, due to its huge scale, genericity and its use of live ('24/7') machine learning. In 2012, the MONK system contained 130 million images of handwritten words, from over twenty historical collections, and 22 thousand trained word classes, continuously refined by new dedicated recognition and retrieval schemes using dedicated shape features for handwriting and user labeling.
The availability of thousands of example images for single classes of complex patterns has brought pattern recognition and machine learning into a new ballpark. More information about Prof. Schomaker can be found here: https://www.rug.nl/staff/l.r.b.schomaker/cv

Round table discussion convened by Martha Fleming (Natural History Museum of Denmark)

Semantics and Beyond: Modeling and enriching longue-durée biocultural data for answering interdisciplinary and epistemic research questions

Natural history museum collections are complex assemblages of biota, minerals, rocks and ores removed from their habitats and then embedded into museological, archival, documentary and classificatory information ecosystems that span hundreds of years — from manuscript notebooks to present-day genomic databases. In this regard, they are ‘biocultural’ collections: of high significance to all humanity, and research objects for both biology and the humanities. Entering a ‘collection,’ specimens are then transformed in multiple, technical and increasingly standardised ways: analysed, arrayed, redistributed, mounted for display, atomised, frozen, scanned at high resolution, sequenced, and of course catalogued. Then these analyses, fragments, scans, genomic sequences and catalogues are aggregated and publicly shared — often nowadays in digital form, and on a global scale.

All these activities are knowledge-producing processes with epistemic value and consequences. Present-day database catalogues of these collections in no way represent the full scale of data that has been recorded about these materials, and of course do not contain significant bodies of often local and/or indigenous knowledge that were not recorded at the time of collection. Considerable collaborative efforts on the part of humanities, digital humanities, and information science scholars working as librarians, archivists and curators are incrementally improving both the enrichment of digitally accessible metadata and the connectivity between different forms of data to improve this situation. And their work is surfacing a raft of epistemic questions with deep historical roots.

Enriching data ‘from below’ is critical to understanding habitat and climate change, yet large-scale online (often open access) biological databases such as GBIF, BOLD, GGI, JSTOR Plants and others still present data derived from earlier incomplete natural historical catalogues. To a significant degree, the research question ‘what counts as information, and why?’ that has recently illuminated much early modern natural history practice is also applicable to the highly sophisticated information architectures of present-day natural history. What provisions are being created at scale to accommodate fresh, valuable and potentially paradigmatic knowledge derived from archival research and made newly interoperable through semantic alignment? This exploratory panel will begin to more clearly articulate and map a number of these epistemic questions and showcase some interdisciplinary answers and solutions. It is hoped that the findings will contribute to, and be extended by, working groups being initiated by major digital infrastructure initiatives such as Europeana, Biodiversity Heritage Library and DiSSCo in both the sciences and the humanities across Europe and beyond.

Round-table chairs:

Martha Fleming (Natural History Museum of Denmark)
Katy Wolstencroft (Leiden University)

Panelists:

Sally Chambers

Sally Chambers is Digital Humanities Research Coordinator at the Ghent Centre for Digital Humanities. She has also been invited by DiSSCo to advise and inform the developing consortium on the needs and contributions of humanities researchers in relation to large-scale biological data.

Isabelle Charmantier

Isabelle Charmantier is Head of Collections at the Linnean Society, a significant historical collection that includes a herbarium, natural history specimens, illustrations, notebooks, and catalogues (both manuscript and digital). Her post-doctoral work uncovered the critically important paper information architectures in the work of Linnaeus.

Tahani Nadim

Tahani Nadim is a junior professor for socio-cultural anthropology (joint appointment: Museum für Naturkunde Berlin and Humboldt-University's Institute for European Ethnology). Her interdisciplinary research project "Data Natures" problematizes data practices and data infrastructures in biodiversity discovery and natural history collections.

Nicky Nicolson

Nicky Nicolson is Senior Research Leader in Biodiversity Informatics at the Royal Botanic Gardens, Kew. She is conducting data-mining research on digitised herbarium specimen collections mobilised via the GBIF network and scientific name publication data from the International Plant Names Index, in order to identify distributed 'duplicates' as well as historical collection events and collector trajectories.

Neil Safier

Neil Safier is the Director and Librarian of the John Carter Brown Library and Associate Professor in the Department of History at Brown University. He is an environmental historian specialising in Central and South American colonial contexts. The John Carter Brown Library is the preeminent research collection in the world for the study of the Americas prior to 1825.