The relationship between science and literature in an innovative PCTO project on the theme of semantic analysis, told in a three-voice interview

Lucia the Great
Teacher of literary disciplines, Latin and Greek, Liceo Classico Mameli in Rome

Sarah Zuzzi
Mathematics and physics teacher, Mameli Classical High School in Rome

Mario Santoro
Researcher at IAC-CNR in Rome

Dear Professors and Dear Researcher,

thank you for your availability. We want to explore with you the theme of the PCTO project “Semantic analysis of texts with language and topic model algorithms” of the Goffredo Mameli Classical High School in Rome, in which Piazza Copernico is participating with its R&D sector. We want to understand with you not only what will be done in the project that has just started, but also the needs that have been identified in your school and the reasons that led you to choose semantic analysis as the subject of the PCTO.

To start, can you tell us what objectives a PCTO must pursue?

The objective of a PCTO, according to the guidelines published by the Ministry of Education with the Decree of 4 September 2019 no. 774, is to promote training paths which direct students into the world of work and which, by offering them the possibility of interacting with expert professional figures, guide their interests, favoring the development of their skills and competences.

The high school student, who in the second two-year period and in the fifth year is required to choose one or more of the courses prepared by the school, thus has a dual task: to learn from people who have skills in a particular working sector, acquiring new knowledge, and to put into practice the teachings received, developing transversal skills.

If on the one hand, in fact, the student is called to participate in frontal and/or laboratory lessons, to assist in activities related to specific work sectors and to get to know the places where the related professions are practised, on the other hand he or she must demonstrate what has been learned by translating this learning experience into one or more project activities (practical or theoretical), assigned during and/or at the end of the path, of which the student is the protagonist and which will be discussed in the final exam. It is therefore clear that, within a broader framework that identifies the aims of the experience, a path conceived in this way aims in general to strengthen the autonomy, motivation, awareness and self-confidence of the student and contributes to forming a responsible citizen.

A provision dating back to the Moratti reform (art. 4, Law 53/2003) has also made it possible to organize these courses "under the responsibility of the educational or training institution, on the basis of agreements with companies or with the respective representative associations or with the chambers of commerce, industry, crafts and agriculture, or with public and private bodies including those of the third sector, available to welcome students for periods of internships which do not constitute an individual employment relationship".

Several schools, especially in recent years, have seized this as an important opportunity to increase the effectiveness of these courses and give them greater value: with the direct involvement of the educational institution it is in fact possible to follow the students closely and motivate them by promoting courses that have real relevance to the subjects of the curriculum and the topics of study addressed, building an integrated and synergistic proposal between school, the world of research and the world of work. This is our case: it is the experience that we, the literature and mathematics teachers of the Mameli high school, have built together in collaboration with the CNR and the Piazza Copernico company.

Can you tell us, in a few but meaningful words, the objective of your PCTO and the chosen theme?

Having consolidated over time many different types of semantic analysis, which are currently implemented in Semanticase, in 2021 we wanted to better investigate the logical structures between topics.

We were interested in better understanding the relationships between topics, not only in hierarchical terms but also in terms of links and connections. Being able to identify the topics that connect different thematic clusters, or that act as hubs for several topics, means gaining an interesting view of the fundamental conceptual nuclei of a set of texts.

We considered this study essential to gain an ever deeper understanding of the subject matter analyzed by Semanticase.
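
As a minimal sketch of what "topics that connect clusters or act as hubs" can mean in practice, the example below builds a small graph in which two topics are linked when they appear in the same text. The topic names, the cluster assignment and the function names are all invented for illustration, and this is not a description of how Semanticase computes these relationships.

```python
from collections import defaultdict
from itertools import combinations


def build_topic_graph(doc_topics):
    """Link two topics whenever they appear together in the same text."""
    graph = defaultdict(set)
    for topics in doc_topics:
        for a, b in combinations(sorted(set(topics)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other topics each topic is directly linked to."""
    n = len(graph)
    return {t: (len(nbrs) / (n - 1) if n > 1 else 0.0) for t, nbrs in graph.items()}


def bridging_topics(graph, cluster_of):
    """Topics with at least one neighbour that belongs to a different cluster."""
    return sorted(
        t for t, nbrs in graph.items()
        if any(cluster_of[n] != cluster_of[t] for n in nbrs)
    )


# Hypothetical data: the topics found in five texts, and an assumed clustering.
doc_topics = [
    ["identity", "mask"],
    ["identity", "madness"],
    ["land", "family"],
    ["family", "guilt"],
    ["identity", "family"],
]
cluster_of = {
    "identity": "self", "mask": "self", "madness": "self",
    "land": "community", "family": "community", "guilt": "community",
}

graph = build_topic_graph(doc_topics)
centrality = degree_centrality(graph)
bridges = bridging_topics(graph, cluster_of)
print(centrality)
print(bridges)
```

In this toy data, "identity" and "family" are both the most connected topics and the only ones that link the two clusters, which is the kind of conceptual nucleus the analysis is meant to surface.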

In promoting the PCTO “Semantic analysis of texts with language and topic model algorithms”, Prof. Zuzzi and I have set ourselves one main goal: to show our students in class 3E how the humanities (in this specific case, Italian literature) and scientific subjects (in this specific case, mathematics) can enter into dialogue, and indeed how one can be called upon to support the other in research that moves in the same direction.

The research in question focuses on the study of words in their authenticity as signifiers and in their variety of meaning, within a system of relationships that conditions them depending on the different contexts (i.e. the texts they belong to). Specifically, we have chosen to analyze the "words" of the literature of the late nineteenth and early twentieth centuries, in particular the words used by Luigi Pirandello, Federigo Tozzi and Grazia Deledda.

Thus, students are called to get to know literary authors whom they will study in detail during the fifth year, and to work first on a traditional semantic analysis of selected texts (novels and short stories), taken as models to identify the main lines of the poetics of the authors in question. Mathematics then comes into play: the textual data taken as a reference are processed by semantic algorithms using the Semanticase graphical dashboard, born from a collaboration between the CNR and the research and development area of Piazza Copernico.

I believe it is essential that students understand that outside of school, in the world of research and work, the division between disciplines is very blurred: scientific and humanistic skills integrate to build tools, ideas and knowledge. The idea of carrying out, within a PCTO, a quantitative and graphical investigation into the semantic analysis of texts, a method very distant from the school world, arose precisely because the concept of text as data is increasingly present across the world of research and work (in the management of company complaints, in marketing, in linguistic research, etc.). I believe that young people should leave secondary school with at least a basic knowledge of some of the new technologies, knowing how to use and interpret them consciously and with a critical spirit.

You are teachers of literature and scientific subjects in a classical high school, and in the PCTO you deal with the topic of semantics and artificial intelligence. What is important for Generation Z kids to understand about the evolution of digital communication?

The idea is to focus on the ongoing innovation in the analysis of digital texts. The enormous creation and exchange of text, which takes place every day from heterogeneous sources, has made it necessary to define new approaches for understanding the content and its specific semantics, and for detecting homogeneity or non-homogeneity across different groups. To this end, the concepts of language model and topic model that underlie textual data analysis algorithms are introduced in this path without mathematical formalism, placing the emphasis on understanding the results expressed through different graphical representations. Furthermore, it is essential to make the students understand that here too, as in any model, it is necessary to make choices based on the context being analysed; you cannot simply push a button and get the analysis. These choices are made by scientific and literary experts who work in synergy with each other. To give an example, in the first meeting of the PCTO we analyzed a humorous story by Anton Chekhov (The deforming mirror) with Semanticase, deliberately setting an unsuitable option in the algorithm, i.e. the calculation of Sentiment (a binary positive/negative variable). This binary analysis was not the most appropriate in the context of this story, because humor needs more nuance and a simple positive/negative Sentiment is not enough to define it. If we change the context and look at company complaints, Sentiment is widely used as a first snapshot of the text, to quickly understand whether customers speak positively or negatively about the service. So the synergy between experts, needed to understand the context and decide which choice to make, is the basis of any good analysis.
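
To make the point about binary Sentiment concrete, here is a minimal sketch of a word-list classifier. The word lists, the function names and both example sentences are invented for illustration (the ironic sentence is not a quotation from Chekhov), and this is not the algorithm used in Semanticase.

```python
import re

# Hypothetical mini-lexicons, far smaller than anything used in real analysis.
POSITIVE = {"good", "great", "helpful", "fast", "beautiful"}
NEGATIVE = {"bad", "slow", "rude", "broken", "ugly"}


def sentiment_score(text):
    """Count positive words minus negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def binary_sentiment(text):
    """Collapse the score into one of two labels (ties count as positive)."""
    return "positive" if sentiment_score(text) >= 0 else "negative"


# Invented examples: a customer complaint and an ironic sentence.
complaint = "The service was slow and the operator was rude."
ironic = "Such a beautiful mirror: it makes everyone look ugly."

print(binary_sentiment(complaint))  # the label matches what the customer means
print(binary_sentiment(ironic))     # a label is forced even though the score is 0
```

For the complaint, the single label is a useful first snapshot. For the ironic sentence the positive and negative words cancel out, yet the binary option must still pick a side, which is why the choice of analysis has to follow the context.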

Digital analysis inevitably overturns the system of rules that we are used to following when we reflect on a text. At school we teach our pupils to carry out a linguistic analysis of a text only after reading, contextualizing and understanding (decoding) it. So we reflect on a text we already know and identify its keywords and semantic fields of reference, analyze the narratological categories and the stylistic choices that link the level of content to that of form; we investigate the relationships that exist between the elements within the text itself (intratextual analysis), which we then place in relation to other texts by the same author or by other authors (intertextual analysis). The analysis of linguistic and stylistic elements is essential for understanding the values and the reality that the author intends to represent in the work, so we cannot help but read, read, read if we want to truly understand. But when there are so many texts and we have neither the time nor the possibility to read them all, mathematics can help us and suggest, starting from well-formulated questions, how to direct our research, which analysis tools to favor, which passages of the reference texts to consider for the type of investigation we intend to conduct, and how to arrive at valuable reflections on both form and content. This is what we want our students to understand.

Digital skills at school, understanding of the meanings in texts, innovation and artificial intelligence, laboratory and cooperative activities: the project “Semantic analysis of texts with language and topic model algorithms” has many ingredients. What do you expect will be the main factor for the active involvement of the students? What are the learning objectives, and what is the specific value of this experience?

Prof. Zuzzi and I have been working together for several years now. We met at Mameli, and in this high school we have built various interdisciplinary projects for our students in which mathematics, statistics, Italian literature, civic education and also Latin and Greek entered into dialogue. We believe in a way of teaching that gives students an active role, calls on them to be aware of the cognitive process involved and makes them protagonists of the learning path. In short, we believe in teaching that stimulates students to acquire skills independently, skills that today are certainly more numerous and complex, given the complexity of our world of work. We therefore plan the activities only in broad outline and build our paths progressively, paying close attention to the needs that emerge from time to time from the mostly cooperative and laboratory work context and to the requests of individual students.

The winning factor for the active involvement of the students is the use of digital and multimedia tools for text analysis and for the communication and dissemination of the method and results. Education today cannot ignore technology and the diffusion of devices: the use of digital tools, and therefore of an experimental way of teaching that speaks the language of our students, supports the success of a project and allows us to offer concrete examples of how digital tools can be used actively, intelligently and profitably.

For the traditional comparison and analysis of texts, for example, we have decided to privilege the activity of Social reading, a shared reading practice that allows the teacher to read a digital text together with the students, to comment on it, to analyze its keywords and semantic fields, according to the dynamics of interaction and communication typical of social networks.

Furthermore, to communicate the results of our research, the tools used are multimedia ones: the blog, the discussion forum and the web page.

The concepts and tools that have emerged in the digital and data science/machine learning/AI fields are today applicable across ever new contexts: in the social and linguistic sciences, in communication sciences and the humanities, in economics and management, they are effective tools for innovation and for improving work. We therefore aim to guide our students towards becoming familiar with these new techniques, using them consciously and with a critical spirit, and deepening the underlying concepts and the various contexts of application.

The choice of a laboratory and cooperative context in the school environment is now a necessity. In recent years, both psychological and neuroscientific research has made it clearer that learning is above all a social experience, in which interaction with others and teamwork are priority elements. Group work is most effective with continuous peer interaction, because it produces a full-immersion effect. I can cite two of the many books by pedagogists and experts in the field that focus on this concept: “In a group you learn. Cooperative and personalized learning of teaching processes” by Mario Martinelli (Ed. SEI, School and Life series, 2004) and “I learn with the others” by Daniele Novara and Elena Passerini (Ed. Centro Studi Erickson, 2015).

The class involved in the project is a third year of the Liceo Classico. Why this choice?

Class 3E is a class we have been working with for three years: they are enthusiastic young people, eager to learn, always ready to accept new challenges and get involved. We therefore had no hesitation in involving them in our project: we knew well that for Viola, Angela and Giorgia it would be an incentive to come out of their shells and fearlessly defend their ideas; for Nicola, Giovanni, Claudio and Ludovica, an encouragement to build critical thinking at both a literary and a technological level; and in general, an opportunity for everyone to understand that mathematics is useful for things they would never imagine.

The choice to delve into the literature of the early twentieth century, with particular reference to the development of narrative in the form of novellas, short stories and novels, was not accidental either. The Italian literature program is so vast that it is often not possible to deal exhaustively with topics which, precisely because of their proximity to us, appear truly current and spontaneously attract our students, stimulating them to valuable critical reflections. Starting from the third year, therefore, especially if you have the opportunity to work with motivated and committed students, as in our case, it is good to propose, alongside the topics indicated by the ministerial programmes, training courses that stimulate them to reason, to reflect on themselves, to develop personal tools of investigation and to face the reality that surrounds them with greater awareness. Discussions with colleagues from other disciplines, such as history and philosophy, are also welcome, because they allow us to build a more articulated dialogue.

Furthermore, from a scientific point of view, the point of following this path in the third year of the classical high school, where students have been dealing with functions for a year and started studying physics a few months ago, lies in introducing the concept of a model to represent a phenomenon different from those traditionally described by mathematics, one in which the text is the data. In this way students come into contact with innovative topics from the world of research and work and begin to make their university choices with greater awareness.

This project involves a research organization, a CNR institute in Rome. How can this experience contribute to enriching the experience of a classical high school? And how does this initiative contribute to the Institute's planning for the Curvatura digitale?

I have already spoken at length about the importance of bringing the world of research and the school into dialogue. As regards the Curvatura digitale training proposal for the classical high school, which our school intends to introduce from the next school year, the PCTO on the semantic analysis of texts will be a module implemented in the third year: the current path is only a pilot project.

The way we produce and store texts has changed dramatically in recent years. A very small space such as that of a hard disk can today contain thousands of texts which, up until a few decades ago, could only be contained in a library. We are now all used to having a library in the palm of our hand. Conversely, our way of approaching texts has not changed very much: we process them one by one, using the computer only to quickly search for words we already know, as in a glossary. This approach, which has been part of our culture for centuries, is bottom-up: I start from the single text, I summarize it, I compare the summaries. In a world that instead produces enormous quantities of texts, it is becoming increasingly important to change the paradigm and adopt a top-down approach: I examine a corpus of texts with the aid of algorithms that carry out a synthesis less affected by subjective bias, I deepen the topics by reading ad hoc some texts or parts of them (chosen on the basis of that analysis), I use the synthesis obtained to make comparisons and, ultimately, I investigate the individual texts. This type of approach keeps intact the tools of literary exegesis typical of the classical high school while projecting students into the future, into a world in which literary analysis fully exploits technological resources by governing them, rather than being crushed by them or taking refuge in conservatism for its own sake.
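
The top-down workflow described above can be sketched in a few lines. Note that this sketch replaces a real topic model with a much simpler stand-in (counting in how many texts each word occurs); the corpus, the stopword list and the function names are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list and corpus, kept tiny on purpose.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "for"}

corpus = {
    "text_a": "the mask hides the face and the mask deceives",
    "text_b": "the village and the land of the family",
    "text_c": "a mask for the family in the village",
}


def tokenize(text):
    """Lowercase the text and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def corpus_keywords(corpus, k=3):
    """Step 1: summarise the whole corpus before reading any single text.

    Words are ranked by the number of texts they occur in (ties alphabetically).
    """
    doc_freq = Counter()
    for text in corpus.values():
        doc_freq.update(set(tokenize(text)))
    return sorted(doc_freq, key=lambda w: (-doc_freq[w], w))[:k]


def texts_to_read(corpus, keywords, n=2):
    """Step 2: choose the texts in which the corpus-level keywords are densest."""
    def coverage(text):
        tokens = tokenize(text)
        return sum(tokens.count(k) for k in keywords) / max(len(tokens), 1)

    return sorted(corpus, key=lambda title: (-coverage(corpus[title]), title))[:n]


keywords = corpus_keywords(corpus)
reading_list = texts_to_read(corpus, keywords)
print(keywords)
print(reading_list)
```

The order of the steps is the point: the summary of the corpus comes first and decides which texts deserve a close reading, whereas in the bottom-up approach the close reading of every text comes first.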

What are the innovation topics that will be covered during the PCTO and why do you think they are relevant topics in the preparation of students also in the humanities courses?

We deal in a technical and non-detailed way with different points of innovation in the humanistic field. Let's start in order:

  • The use of these tools allows us to think about our approach to the text and its interpretation. Reflecting on which model is best to use and on the expected results leads to a discussion of the superstructures inevitably present when analyzing texts. This reflection represents an innovation in the approach to textual data and their analysis, especially in the literary field.
  • In the exegesis of texts it is common practice to start with a bottom-up approach: the texts are analyzed one by one, and then a thread that unites them is sought. Some semantic algorithms, such as topic modeling, instead take a top-down approach: we start from a synthesis of the corpus under study and then trace it through the individual texts. This is also an innovation, one that reduces the bias related to subjective choices.
  • Semantic algorithms make it possible to measure the similarity between sentences; once some representative sentences of a text have been identified, this invites us to compare them with sentences from other texts by the same author or by other authors. This is a totally innovative kind of approach.
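
The idea of comparing a representative sentence with sentences from other texts can be sketched with a simple measure, cosine similarity over word counts. This is a stand-in chosen for brevity: the source does not say which similarity measure is used, and the sentences, the stopword list and the function names below are invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, kept tiny on purpose.
STOPWORDS = {"the", "a", "an", "to", "of", "and"}


def bag_of_words(sentence):
    """Count the non-stopword words of a sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine_similarity(s1, s2):
    """Cosine of the angle between the word-count vectors of two sentences."""
    v1, v2 = bag_of_words(s1), bag_of_words(s2)
    dot = sum(v1[w] * v2[w] for w in v1.keys() & v2.keys())
    norm = math.sqrt(sum(c * c for c in v1.values())) * math.sqrt(
        sum(c * c for c in v2.values())
    )
    return dot / norm if norm else 0.0


def most_similar(query, candidates):
    """Return the candidate sentence closest to the query."""
    return max(candidates, key=lambda c: cosine_similarity(query, c))


# Invented sentences, not quotations from the authors studied.
query = "the mask hides the true face"
candidates = [
    "every face wears a mask",
    "the land belongs to the family",
]

closest = most_similar(query, candidates)
print(closest, cosine_similarity(query, closest))
```

A measure based only on shared words misses synonyms and paraphrases, so it should be read as a first illustration of the principle rather than as a realistic semantic comparison.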

What role will Piazza Copernico play in the project? And how can the experience of a company actually be useful for young people in understanding how to direct their studies?

Our central idea in this PCTO is interaction and dialogue. The interaction, as we have said repeatedly, is between scientific and literary subjects and among young people through cooperative and laboratory work, but it is also between school, the world of research and the world of work. The idea is to make our students understand that behind every technological tool there is a model and a study in which various "worlds" (disciplines) talk to each other to obtain a product and its applications.

Piazza Copernico has a triple role:

  • to make the Semanticase product available;
  • to show how the semantic analysis of texts is used in a business environment;
  • to focus attention on a fundamental concept, communication, giving the students the tools to build a final product for disseminating the project.