The HyGraph Research Project

Graphs are simple yet highly expressive data structures for modeling and analyzing relationships between real-world objects. Because the structure and content of graphs change continuously, e.g. in social networks or transport and mobility networks, novel data models and analysis mechanisms are needed. Our goal is to develop HyGraph, a new hybrid data model that seamlessly combines temporal graphs with time series and enables high-frequency updates through graph streams. Combining these in a unified hybrid model paves the way for new query, analysis, data mining and machine learning tasks.

Hybrid Data Model

The HyGraph model combines temporal graphs and time series with real-time graph streams into a single abstraction.

Query Operators

Hybrid operators will be developed to utilize the new data model and all substructures, allowing a wide range of options for combining OLTP and OLAP approaches to create new types of analytics.

Intelligent Analytics

With HyGraph we will develop new types of analytics and data mining operators that benefit from the model combination. The model also opens up new ways of making predictions with AI-based models.

Development and Application

The new data model, operators and analysis approaches will be implemented as a prototype, evaluated and applied to use cases from the micro-mobility and IT security domains.

Hybrid Data Model

The HyGraph model combines temporal graphs and time series into a single abstraction. Real-time graph streams are used to continuously update the hybrid graph. The model includes data representation, maintenance operations, updatability, and retrieval and querying of substructures while ensuring consistency with inference or integrity rules.

One of our tasks is to design a flexible data model that can handle dynamic relationships and time-series data. Two potential designs have already been proposed, and suitable types and properties need to be defined to encode evolving input data. We will support maintaining and updating the hybrid graph data model as new data arrives, and the focus is on developing maintenance algorithms for update, insert, and delete operations. The plan is to use the properties of the hybrid graph to group maintenance operations as bulk operations and to enforce consistency using mechanisms similar to key constraints for property graphs.
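
The ideas in this section can be illustrated with a minimal sketch. It is not one of the two proposed HyGraph designs; every class and method name below (`TimeSeries`, `HybridGraph`, `apply_bulk`, and so on) is hypothetical. It assumes that vertices and edges carry validity intervals, that each vertex may own named time series, that a key constraint means "at most one vertex per key", and that a bulk operation is validated as a whole before any part of it is applied.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TimeSeries:
    """Timestamped observations, kept sorted by timestamp."""
    timestamps: list = field(default_factory=list)
    values: list = field(default_factory=list)

    def append(self, ts: int, value: float) -> None:
        # Insert in timestamp order so out-of-order arrivals are tolerated.
        pos = bisect_right(self.timestamps, ts)
        self.timestamps.insert(pos, ts)
        self.values.insert(pos, value)

    def window(self, start: int, end: int) -> list:
        """Return the values with start <= timestamp < end."""
        return [v for t, v in zip(self.timestamps, self.values) if start <= t < end]


@dataclass
class Vertex:
    key: str
    valid_from: int
    valid_to: Optional[int] = None  # None means "still valid"
    series: dict = field(default_factory=dict)  # series name -> TimeSeries


@dataclass
class Edge:
    source: str
    target: str
    valid_from: int
    valid_to: Optional[int] = None


class HybridGraph:
    """Hypothetical container: a temporal graph whose vertices own time series."""

    def __init__(self) -> None:
        self.vertices = {}
        self.edges = []

    def insert_vertex(self, key: str, valid_from: int) -> None:
        # Assumed key constraint: at most one vertex per key.
        if key in self.vertices:
            raise ValueError(f"duplicate vertex key: {key}")
        self.vertices[key] = Vertex(key, valid_from)

    def insert_edge(self, source: str, target: str, valid_from: int) -> None:
        # Assumed integrity rule: both endpoints must already exist.
        for endpoint in (source, target):
            if endpoint not in self.vertices:
                raise KeyError(f"unknown vertex: {endpoint}")
        self.edges.append(Edge(source, target, valid_from))

    def delete_vertex(self, key: str, ts: int) -> None:
        # Logical delete: close the validity interval instead of dropping data.
        self.vertices[key].valid_to = ts
        for edge in self.edges:
            if edge.valid_to is None and key in (edge.source, edge.target):
                edge.valid_to = ts

    def apply_bulk(self, observations: list) -> None:
        """Apply a batch of (vertex_key, series_name, timestamp, value) tuples."""
        # Validate the whole batch first so it is applied completely or not at all.
        for key, _, _, _ in observations:
            if key not in self.vertices:
                raise KeyError(f"unknown vertex: {key}")
        for key, name, ts, value in observations:
            self.vertices[key].series.setdefault(name, TimeSeries()).append(ts, value)
```

In this sketch a delete only closes validity intervals, so the history of a vertex and its series remains available for later temporal queries; whether HyGraph handles deletes this way is an assumption.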

[Figure: Model Combination]

Query Operators

Operators will be developed to utilize the new data model and all substructures, allowing a wide range of options for combining OLTP and OLAP approaches to create new types of analytics. The hybrid operator has access to the temporal graph, modeled time series, and incoming changes from the graph stream.

We focus on designing and implementing query operators for hybrid graph databases that can efficiently retrieve both structured and unstructured data. The goal is to address the complexity and heterogeneity of data stored in hybrid graphs. Two tasks have been identified: reviewing existing query languages and operators for graph databases and designing query operators that can handle the complexity of hybrid graph data. The resulting query operators should retrieve both entity-centric and relationship-centric information, as well as hybrid information that combines both entities and relationships.
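
To make the notion of a hybrid operator more concrete, the following sketch combines a temporal-graph predicate (which edges are valid at time `t`) with a time-series predicate (the mean of a vertex's series over a window). It is an illustration only, not a HyGraph operator: the function names, the tuple layouts for edges and series, and the choice to treat edges as undirected are all assumptions.

```python
def snapshot(edges, t):
    """Edges valid at time t; each edge is (source, target, valid_from, valid_to)."""
    return [
        (source, target)
        for source, target, valid_from, valid_to in edges
        if valid_from <= t and (valid_to is None or t < valid_to)
    ]


def window_mean(series, start, end):
    """Mean of the (timestamp, value) pairs with start <= timestamp < end, or None."""
    values = [value for ts, value in series if start <= ts < end]
    return sum(values) / len(values) if values else None


def neighbours_with_mean_above(edges, series_by_vertex, vertex, t, window, threshold):
    """Hypothetical hybrid operator.

    Returns the neighbours of `vertex` in the snapshot at time t whose series
    mean over [t - window, t) exceeds `threshold`, as sorted (key, mean) pairs.
    """
    result = []
    for source, target in snapshot(edges, t):
        # Edges are treated as undirected in this sketch.
        if source == vertex:
            other = target
        elif target == vertex:
            other = source
        else:
            continue
        mean = window_mean(series_by_vertex.get(other, []), t - window, t)
        if mean is not None and mean > threshold:
            result.append((other, mean))
    return sorted(result)
```

The result depends on both substructures: a neighbour drops out either because its edge is no longer valid at `t` or because its series has no sufficiently high values in the window.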

[Figure: HyGraph Operators]

Intelligent Analytics

With HyGraph we aim to provide a variety of analysis and mining possibilities for the data represented by the hybrid data model, including classical statistical analysis and machine learning methods. Two tasks have been identified. The first is to enable analytical pipelines that combine basic operators with algorithms for temporal graphs and time series, so that complex pipelines can be built for advanced analytics tasks. The second is to develop two mining or learning approaches, namely Hybrid Graph Clustering and Frequent Hybrid-Graph Mining, based on previous research on clustering and frequent subgraph mining, and to adopt machine learning methods such as Graph Neural Networks to leverage the structural information contained in the graph.
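
As a rough illustration of the first task, an analytical pipeline can be viewed as a chain of operators that each read and extend a shared state. The sketch below assumes a plain dictionary state and hypothetical operator names. Its final grouping step is a deliberately naive stand-in with hard-coded thresholds; it is not Hybrid Graph Clustering.

```python
from collections import defaultdict
from functools import reduce
from statistics import mean


def pipeline(*steps):
    """Chain operators left to right; each takes a state and returns a new one."""
    return lambda state: reduce(lambda acc, step: step(acc), steps, state)


def add_degree(state):
    """Graph operator: add each vertex's degree to its feature record."""
    degree = defaultdict(int)
    for source, target in state["edges"]:
        degree[source] += 1
        degree[target] += 1
    previous = state.get("features", {})
    features = {v: {**previous.get(v, {}), "degree": degree[v]} for v in state["vertices"]}
    return {**state, "features": features}


def add_series_mean(state):
    """Time-series operator: add the mean of each vertex's series (0.0 if empty)."""
    previous = state.get("features", {})
    features = {}
    for v in state["vertices"]:
        values = state["series"].get(v, [])
        features[v] = {**previous.get(v, {}), "mean": mean(values) if values else 0.0}
    return {**state, "features": features}


def group_by_profile(state):
    """Naive grouping on (degree >= 2, mean >= 5.0); thresholds are arbitrary."""
    groups = defaultdict(list)
    for v in sorted(state["vertices"]):
        f = state["features"][v]
        groups[(f["degree"] >= 2, f["mean"] >= 5.0)].append(v)
    return dict(groups)


analyse = pipeline(add_degree, add_series_mean, group_by_profile)
```

Because each operator only adds to the feature record, graph-based and series-based steps can be reordered or swapped without changing the others.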

[Figure: HyGraph Operator Combination]

Development and Application

The new data model, operators and analysis approaches will be implemented as a prototype, evaluated and applied to at least one use case, either from the micro-mobility domain (bike sharing) or from the IT security domain.

We will design and realize the overall system architecture based on the HyGraph model while addressing challenges such as handling large data volumes and ensuring performance and scalability. We will apply it to use cases such as analyzing bike rental data or modeling system calls to identify IT system intrusions. The goal is to demonstrate the effectiveness of HyGraph in solving real-world problems and to optimize its design and implementation.

[Figure: Use Case]

Meet The Team

Prof. Dr. Angela Bonifati

Lyon - Principal Investigator

Dr. Remy Cazabet

Lyon - Dynamic Network Mining

Dr. Riccardo Tommasini

Lyon - Graph Streams

Dr. Shubhangi Agarwal

Lyon - Subgraph matching

Prof. Dr. Erhard Rahm

Leipzig - Principal Investigator

Dr. Eric Peukert

Leipzig - Data Mining

Christopher Rost

Leipzig - Temporal Graphs

Mouna Ammar

Leipzig - PhD student

Related publications

Evolution of Degree Metrics in Large Temporal Graphs

2023

Christopher Rost, Kevin Gómez, Peter Christen, Erhard Rahm, et al.

PG-Schema: Schemas for Property Graphs

2023

Renzo Angles, Angela Bonifati, Stefania Dumbrava, George Fletcher, Alastair Green, et al.

Distributed temporal graph analytics with GRADOOP

2022

Christopher Rost, Kevin Gómez, Matthias Täschner, Philip Fritzsche, Lucas Schons, et al.

Seraph: Continuous Queries on Property Graph Streams

2024

Christopher Rost, Riccardo Tommasini, Angela Bonifati, Emanuele Della Valle, Erhard Rahm, et al.

Time2Feat: Learning Interpretable Representations for Multivariate Time Series Clustering

2022

Angela Bonifati, Francesco Del Buono, Francesco Guerra, Donato Tiano


Acknowledgement

This project is funded by the German Research Foundation (DFG) under grant number RA 497/25-1 and the French National Research Agency (ANR) under grant number ANR-22-CE92-0025-01.