Dante in e-book format: Anna Raimo interviews Michelangelo Zaccarello

Author of essays and volumes published in Italy and abroad, Michelangelo Zaccarello is Full Professor of Italian Philology at the University of Pisa, after several years of teaching between Dublin, Oxford and Verona. He is the author of several critical editions and a handbook of Italian Philology (L’edizione critica del testo letterario, Milano, Mondadori, 2017). More recent writings apply the same theoretical and methodological perspective to the advent of collaborative authorship and digital textuality: Teoria e forme del testo digitale (Roma, Carocci, 2019) and Leggere senza libri. Conoscere gli e-book di letteratura italiana (Firenze, Cesati, 2020).

First, dear Professor Zaccarello, thank you for agreeing to be interviewed by Insula Europea and congratulations on your new book: Reading without books. Getting to know the e-books of Italian literature. In the preface of this volume, you immediately state the aims of the book: “Through a review of the most recent studies on these complex issues, this book addresses the most topical questions of digital textuality […] Among the main aims of the volume is a contribution to the wider awareness of the profound changes that electronic textuality has brought to the long-tested rituals of reading and study” («Attraverso una rassegna dei più recenti studi su tali complesse questioni, questo libro affronta le questioni più attuali della testualità digitale […] Fra i principali propositi del volume c’è un contributo alla più ampia presa di coscienza dei profondi mutamenti che la testualità elettronica ha impresso ai collaudati rituali della lettura e dello studio»). What other motivations prompted you to tackle this issue?

Before I came to digital textuality, which has been the focus of my interest for no more than ten years, my field was, and still is, philology, that is, the discipline that studies the ways in which texts are produced, published and circulated, with the aim both of producing scientifically conducted editions and of studying historically the various forms of their reception and use. All these aspects have been profoundly transformed over the last decades by the advent of digital technologies: these have quickly gone from conditioning the use of literary texts (increasingly “searched” rather than read in a linear and sequential manner) to profoundly transforming the method of data collection and interpretation, with ever greater attention given to research on corpora of considerable size and not always internally homogeneous, putting the breadth of the base interrogated before the “fineness” of the encoding. The result is a new centrality of search interfaces, which are now part of everyday life but must be used consciously, in order not to distort the very meaning of the search: conditioned by keywords and user profiling, in fact, we increasingly risk finding only what we were already looking for, following preconceived lines that ranking algorithms are designed to identify and support. If this is the way in which fake news and conspiracy theories make their way, the consequences for scientific research can be very serious.

You have been a pioneer of Digital Humanities applied to the study of philology; how did your interest in this topic arise? Also, when you started your studies, did you expect such a rapid development of these new technologies?

There are actually many other scholars who deserve the title of pioneer of DH in Italy: Pasquale Stoppelli, Domenico Fiormonte and Dino Buzzetti, to name only three scholars in three distinct areas of this vast field (digital archives, the conservation and study of variant material, and theoretical reflection on digital textuality). What is certain is that it is a fascinating field precisely because of the profound changes that, driven by rapid technological progress, can occur even within a few years. This is particularly evident in tools that quickly become part of our daily life, such as search engines. The latter run ranking algorithms that favour the most visited, or better linked, pages in ways that to a large extent disregard the validity and reliability of their contents. In other words, pages are assessed only for their relevance to the search terms and for their integration into the Web’s system of mutual links.

In the first chapter, you mention the limited diffusion of books in digital format, explaining the reasons and mentioning, among others, the different taxation compared to paper format. Do you think that governments can implement ad hoc policies to support digital publishing?

An important step forward was the equalization of VAT on e-books with the reduced rate of 4% for paper books. Not everyone knows that, until a few years ago, the taxation of such products was aligned with that of digital media such as CDs and DVDs: 22% instead of the 4% reserved for books, a regime that made it very difficult for publishers to offer a real advantage to readers who wanted to embrace the new format. In Italy, this obviously fair decision was taken in 2015, but it was endorsed by Ecofin only in October two years ago. For the rest, the progress of digital formats in the publishing market can only come from an increase in the demand for reading, and that demand will eventually have to overcome a certain attachment to the reassuring tactile relationship with the pages, to the richness of the traditional reading experience, and to the greater memorability of the paper page (as the book recounts, these aspects were already highlighted in 1996 by an unexpected voice, that of Bill Gates!).

In your book, you often refer to the lack of attention paid by users to the quality of the content they download, referring to the classics of Italian literature; do you think this is true of all students or are humanists, on average, more attentive to sources?

I teach a subject in which most texts are not covered by copyright. Students have been taking advantage of this possibility for a long time and, it must be said at once, there is no reason to criticize this habit, as long as one does not rely on it too much. I used to rely on these texts myself when preparing my lessons, but it was by checking paper sources that I began to doubt their reliability and to investigate how the scans from which our e-books are derived are acquired. The procedure is not absolutely accurate to begin with, but one must also bear in mind that, in order to avoid claims from modern publishers, the sources scanned are almost always old editions: their yellowed paper and faded and/or unusually shaped characters (as are, in relation to today’s computer fonts, those of almost all books printed by hand, i.e., up to the 1980s) may pose difficulties for one or more stages of the procedure, with significant repercussions on the accuracy of “reading”. The humanist reader should be more sensitive to the textual quality of what he reads, but it must also be said that times and modes of reading have changed in the digital context, favouring a more hurried and fragmented use of the literary text.

In the fifth chapter, you dwell on the digital editions of the Vita Nova, showing a few errors; are the other digitized works of Dante, including the Commedia, also philologically inaccurate?

At first glance, the Commedia does not seem to share the problems of the minor works: the research is still in progress and it does not seem fair to anticipate the results. In any case, there is no doubt that the intense and continuous practice of commentary has brought discrepancies to light, and these have gradually been corrected. Compared to the Vita Nuova, mostly digitized from the venerable text established by Michele Barbi in 1932, the sources of the Commedia are much more recent, and create fewer problems for the electronic eye. Set against today’s computer fonts, almost all books printed by hand, i.e., up to the 1980s, with their inevitable signs of age (yellowed paper, faded and/or unusually shaped characters), cannot help but pose difficulties at one or more stages of the scanning procedure, with significant repercussions on the accuracy of “reading”.

How do you think the quality of digitisation could be improved: would more human intervention be enough?

Since the 1990s, there has been a veritable race to digitise literary works, especially those not subject to copyright, i.e., those of authors who have been dead for at least seventy years: for Italian literature, this now includes not only Svevo, D’Annunzio and Pirandello, but in a few days also Cesare Pavese. In that crucial phase of mass digitisation, which only came to an end after the first decade of the 2000s, the goal was to digitise as much material as possible, in order to realise the admittedly laudable dream of a “universal” library that would make world literature available to everyone free of charge. In this way, several tens of millions of titles have been digitised, especially thanks to the Google Books project (there is no agreement on the real figure), but with a low-cost approach that has inevitably set aside quality in favour of the so-called “good enough”, a category that is very popular in today’s world (where the Internet is always and in any case the place of freedom and free access) but creates quite a few problems for the scientific study of literary works. With this amount of material available, downloadable and shareable online, it is practically impossible to improve its quality systematically: it is much better to invest in projects that develop users’ awareness, for instance by encouraging them to check the paper source, at least partially, when they have to quote a passage.

Can you tell us more about the projects you are working on?

By virtue of its statutory purposes, the Bologna-based “Commissione per i Testi di Lingua”, founded in 1860 with the noble “aim of finding and disseminating, through publication, the works of Italian writers of the fourteenth and fifteenth centuries”, has a major role to play in promoting such awareness in accessing Italian Classics in digital format. A couple of years ago, a Permanent Observatory on scholarly publishing practices and on the authoritativeness of editions of Italian literary texts in the digital context (OPEdIt) was created within the Commission. A consortium of Italian and foreign universities will constitute the operational space of the Observatory, which will check the texts of works from the first centuries of Italian literature available on the web, verify their reliability and promote possible revisions. With particular reference to Dante, Petrarca and Boccaccio (but with the prospect of involving the Italian Classics up to the nineteenth century), the aim is to safeguard the integrity and correctness of the works of Italian literature from the pitfalls of digitisation and online dissemination, by promoting sample checks (e.g. by means of degree theses) and/or research groups that assess both the fidelity of the e-texts to their respective paper sources and the general validity, formal and substantial, of the text they contain.

In your book, you mainly refer to medieval works; do you think that if more recent authors were analysed, such as Giacomo Leopardi, Alessandro Manzoni or even a poetess who was very well known in 19th-century Naples, Maria Giuseppa Guacci, the rendering of the texts would be better?

Unfortunately, we are just taking the first steps, and these are long and difficult verifications: consider that of the Vita Nuova alone (or Vita Nova, according to Gorni’s edition) there are no fewer than seven digital versions, based on different sources and in turn replicated in various ways! The book reflects a state of advancement of such verifications that does not go beyond a few paradigmatic texts of the thirteenth and fourteenth centuries. In any case, it is difficult to suppose that the problems related to the paper source, the digitisation process and the post-production algorithms are very different from what we have found with ancient authors. Paola Italia has devoted very important pages to the deplorable situation in which e-books of the greatest novel of our literature find themselves: not even the title of Manzoni’s masterpiece is rendered correctly, being variously quoted with or without the article, with or without capital letters (Paola Italia concludes: “Six combinations for a single title, which is instead, according to the title page of the two editions, ‘I Promessi Sposi’”). The linguistic aspects of e-books may prove particularly deceptive, since character recognition algorithms employ dictionaries based on current usage, and archaic or regional forms of the ancient language become a source of misreadings and trivialisation.

What do you think of “Biblioteca Italiana”, a project thanks to which many works have been digitised? In your opinion, which digital libraries are currently the most valuable for studying the classics?

Biblioteca Italiana, the digital library of the CIBIT consortium (“Centro Interuniversitario per la Biblioteca Italiana Telematica”), coordinated by the University of Rome “La Sapienza”, is a virtuous example of how large digitisation projects can be supported in collaboration with university staff, and consequently developed with the necessary attention to quality parameters. Born as a digital library, the site has recently expanded into “a portal that brings together a series of useful tools for in-depth study and research” (from the site), confirming the inseparable link that binds the production of high-quality textual resources to the environments that are professionally devoted to teaching and research on those works. In its “democratisation”, in fact, the world of the Internet has revolutionised the practices of reading the Classics, but above all has jeopardised the necessary process of validation of the related critical knowledge. To make this point clearer: in paper publishing, extrinsic parameters such as aesthetics, ease of retrieval, cover price, etc. have the merit of launching the text into a circuit of use that consolidates an implicit hierarchy of values. Whilst expensive critical editions are bought by libraries and academic departments, paperback editions satisfy a general reading demand, and between these two poles there are various intermediate solutions that meet equally specific reading demands. With digitised texts floating on the web in a precarious state of accuracy and lacking any sort of introductory material or annotation (publishers can claim rights to commentaries and the like!), the digital context risks discouraging rather than encouraging reading.

Do you think that a greater interconnection of digital libraries, not only in Italy, could improve their quality and above all their usability for the user?

For decades, parallel digitisation initiatives raced to load as many texts as possible onto their platforms, without paying much attention to their quality and without enabling users to query these databases simultaneously. In computer science, this is known as the “silo” effect: a set of fixed data is created that can only be accessed from within the resource, without any protocols in place to make it communicate and exchange with other similar resources. While in the 1990s this could already count as a good result, since the new millennium an increasing number of projects have been devoted to developing the interoperability of initiatives, promoting the mutual exchange of information and data. According to the standards of the Semantic Web, set out since 2003 but still little exploited in the humanities, new languages for the description and annotation of resources can allow users, through specific automatic query protocols, to carry out transversal searches across various resources, limiting the ‘background noise’ we are all too familiar with from the current keyword search system.

To conclude our chat, I am curious to know: are you working on any digitisation projects today?

I am collaborating on various digital projects, but none of them are digitisation projects (a form which, as I said, has already had its period of greatest momentum). Among them, I would like to mention a very new project, “Archivi Letterari Digitali Nativi” (ALDiNa), set up by Emmanuela Carbè and Tiziana Mancinelli to collect detailed information on, and ensure the adequate documentation and enhancement of, the digital papers of contemporary authors (i.e. their drafts and variants preserved, as has been the case for decades, on floppy disks, CD-ROMs, hard drives, etc.). This is an issue that is still little addressed on the Italian scene, but, in view of the rapid obsolescence of media and programs and the loss of hardware caused by rapid technological progress, it is already urgent to promote awareness of the problem and good practices for the preservation, access and study of this heritage. Indeed, there is no doubt that, apart from a few virtuous but isolated projects such as PAD (“Pavia Archivi Digitali”), there is a remarkable lack of protocols for archiving and preserving this kind of data and media. If we consider that authors who are now in their fifties can more or less be regarded as born digital (i.e. we can assume that they have always worked on digital media), it is clear that a large part of this heritage has already been lost among the many landfills around the world that dispose of and recycle digital waste (e-waste).


Anna Raimo
