# Giulia Preti, PhD

That quite definitely is the answer. I think the problem, to be quite honest with you, is that you’ve never actually known what the question is.

— “The Hitchhiker's Guide to the Galaxy” by Douglas Adams

## All hail!

I am Giulia Preti.

I am a postdoctoral researcher in the area of Learning and Algorithms for Data Analytics at ISI Foundation, Turin (Italy), under the coordination of Francesco Bonchi.

I was a member of the DbTrento research group, and I worked on techniques for mining relevant structures in dynamic and heterogeneous datasets.

I got my PhD in Information and Communication Technology at the University of Trento (Italy), under the supervision of Prof. Yannis Velegrakis.

I was also the teaching assistant for the Computability and Computational Complexity course from 2015 to 2018.

I got my master's degree in Computer Science and my bachelor's degree in Mathematics, both at the University of Trento.

## Research Interests

My research focuses on graphs, a versatile data model increasingly used to represent a wide variety of data, from biology to social networks, and from computer networks to smart cities. In particular, I consider weighted graphs and dynamic graphs.

Weighted graphs are graphs whose nodes and edges are labeled with weights indicating their relevance or quality. Moreover, in applications aiming at offering personalized products and services to each individual user rather than “one size fits all” solutions, each element of the graph naturally carries multiple weights, one for each user.

My goal is to identify structures that appear frequently in the graph and whose appearances are characterized by large weights, and hence are relevant for the user, under the assumption that larger weights indicate higher interest.

Dynamic graphs are graphs that change over time, meaning that their nodes and edges can undergo both structural and attribute changes. They are generally modeled as sequences of static graphs called snapshots. In this context, I am interested in detecting groups of edges that evolve in a convergent manner, meaning that they display a positive correlation in their behavior. These groups of correlated edges, especially when they involve edges that are topologically close, can represent regions of interest in the network.
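As a rough illustration of what "positively correlated edges" can mean, here is a minimal sketch that treats each edge as a time series of weights across snapshots and reports the pairs whose Pearson correlation reaches a threshold. The snapshot representation (one dictionary per snapshot, mapping an edge to its weight), the function names, and the threshold are assumptions made for this example. It only compares pairs of edges and ignores topological closeness, so it is a simplification and not the method from my papers.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlated_edge_pairs(snapshots, threshold):
    """Return the pairs of edges whose weight series correlate at least `threshold`.

    `snapshots` is a list of dicts mapping an edge (a tuple of node names)
    to its weight in that snapshot; a missing edge counts as weight 0.0.
    """
    edges = sorted(set().union(*snapshots))
    series = {e: [snap.get(e, 0.0) for snap in snapshots] for e in edges}
    return [
        (e1, e2)
        for e1, e2 in combinations(edges, 2)
        if pearson(series[e1], series[e2]) >= threshold
    ]


# Hypothetical toy network observed over three snapshots.
snapshots = [
    {("a", "b"): 1.0, ("b", "c"): 2.0, ("c", "d"): 3.0},
    {("a", "b"): 2.0, ("b", "c"): 4.0, ("c", "d"): 2.0},
    {("a", "b"): 3.0, ("b", "c"): 6.0, ("c", "d"): 1.0},
]
pairs = correlated_edge_pairs(snapshots, threshold=0.9)
```

In this toy input, the edges a-b and b-c grow together while c-d shrinks, so only the first two form a correlated pair.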

During my PhD studies, I also worked on entity resolution in highly heterogeneous and temporal databases, defined as collections of records characterized by different schemata and timestamps indicating the date of creation. Reconciling the records in this setting requires specialized similarity functions that take into consideration both the heterogeneity and the dynamism of the data. In my work, I proposed a suitable time-aware schema-agnostic similarity measure and a framework that uses this measure to identify maximal groups of similar temporal records.

### MaNIACS

MaNIACS is a sampling-based randomized algorithm for computing approximations of the collection of the subgraph patterns that are frequent in a single vertex-labeled graph, according to the Minimum Node Image-based (MNI) frequency measure. The output of MaNIACS comes with strong probabilistic guarantees. The quality of the approximation is obtained using the empirical Vapnik-Chervonenkis (VC) dimension, a key concept from statistical learning theory. In particular, given a failure probability δ, a frequency threshold, and a sample size, with probability at least 1 - δ over the choice of the sample, the output of MaNIACS contains each pattern of size k whose relative MNI frequency is greater than the threshold, and the estimated frequency of each such pattern is within ε of its relative MNI frequency.
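To make the MNI measure concrete, the sketch below computes it by brute force on a tiny vertex-labeled graph: for each pattern vertex, it collects the distinct graph vertices that the vertex is mapped to across all embeddings, and takes the minimum of those counts. The function names and the input representation are assumptions made for this example. It also assumes non-induced, label-preserving matching, and normalizes by the number of graph vertices to obtain a relative frequency. It is an exact toy computation, not the sampling-based MaNIACS algorithm.

```python
from itertools import permutations


def embeddings(pattern_edges, pattern_labels, graph_edges, graph_labels):
    """Yield every injective, label-preserving mapping from pattern vertices to
    graph vertices that sends each pattern edge to a graph edge (brute force)."""
    p_nodes = sorted(pattern_labels)
    g_edge_set = {frozenset(e) for e in graph_edges}
    for image in permutations(sorted(graph_labels), len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        if any(pattern_labels[p] != graph_labels[mapping[p]] for p in p_nodes):
            continue
        if all(frozenset((mapping[u], mapping[v])) in g_edge_set
               for u, v in pattern_edges):
            yield mapping


def mni_support(pattern_edges, pattern_labels, graph_edges, graph_labels):
    """Minimum, over pattern vertices, of the number of distinct graph
    vertices that the pattern vertex is mapped to by some embedding."""
    images = {p: set() for p in pattern_labels}
    for mapping in embeddings(pattern_edges, pattern_labels,
                              graph_edges, graph_labels):
        for p, g in mapping.items():
            images[p].add(g)
    return min(len(s) for s in images.values())


# Hypothetical toy graph: a star with center 0 (label "A") and three "B" leaves.
graph_labels = {0: "A", 1: "B", 2: "B", 3: "B"}
graph_edges = [(0, 1), (0, 2), (0, 3)]

# Pattern: a single edge between an "A" vertex and a "B" vertex.
pattern_labels = {"a": "A", "b": "B"}
pattern_edges = [("a", "b")]

support = mni_support(pattern_edges, pattern_labels, graph_edges, graph_labels)
relative = support / len(graph_labels)  # assumed normalization by |V|
```

The pattern has three embeddings in the star, but its "A" vertex can only be mapped to the center, so the MNI support is 1 rather than 3.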

MaNIACS leverages properties of the frequency function to aggressively prune the pattern search space, and thus to reduce the time spent in exploring subspaces containing no frequent patterns. The framework includes both an exact and an approximate mining algorithm.

## Experiences

September 2015 – September 2018

### Teaching Assistant

I was the teaching assistant for the Computability and Computational Complexity course at the University of Trento, Italy.

May 2018 – September 2018

### Visiting Researcher

Thanks to the SoBigData TNA program, I had the chance to visit Aalto University in Espoo, Finland, and work with Prof. Aristides Gionis on event detection in dynamic networks.

July 2017 – September 2017

### Visiting Researcher

I visited the Laboratoire de Recherche en Informatique (LRI) in Orsay, France, and worked with the LaHDAK team, led by Prof. Nathalie Pernelle, on a project about entity matching in heterogeneous datasets.

2019 – ongoing

### Reviewer

I have served as an external reviewer and subreviewer for several conferences and journals: WSDM2022, CIKM2021, ECML-PKDD2021, MIDAS2021, SEAdata2021, SDM2021, TKDD2022, TKDE2021, WWW2021 ("Best of the Best" Reviewer Award), ECML-PKDD2020, ICDM2020, IJCAI2020, SEAdata2020, TKDD2020, VLDB2020, FGCS2019.

## Publications

KDD, 2021

### MaNIACS: Approximate Mining of Frequent Patterns through Sampling

Giulia Preti, Gianmarco De Francisci Morales, Matteo Riondato

PAKDD, 2021

### Discovering Dense Correlated Subgraphs in Dynamic Networks

Giulia Preti, Polina Rozenshtein, Aristides Gionis, Yannis Velegrakis

The Web Conference, 2021

### STruD: Truss Decomposition of Simplicial Complexes

Giulia Preti, Gianmarco De Francisci Morales, Francesco Bonchi

ECML-PKDD, 2020

### Mining Dense Subgraphs with Similar Edges

Polina Rozenshtein, Giulia Preti, Aristides Gionis, Yannis Velegrakis

ICDM Workshops, 2019

### ExCoDE: a Tool for Discovering and Visualizing Regions of Correlation in Dynamic Networks

Giulia Preti, Polina Rozenshtein, Aristides Gionis, Yannis Velegrakis

Distributed and Parallel Databases Journal, 2019

### Mining Patterns in Graphs with Multiple Weights

Giulia Preti, Matteo Lissandrini, Davide Mottin, Yannis Velegrakis

EDBT, 2018

### Beyond Frequencies: Graph Pattern Mining in Multi-weighted Graphs

Giulia Preti, Matteo Lissandrini, Davide Mottin, Yannis Velegrakis