LSOP - Live analyses in sports with Odysseus P2P
The aim of this project group is to calculate and provide sports statistics for an ongoing game in real time by means of distributed continuous processing. The input is sensor data transmitted during a game (e.g. from position sensors in the players' shoes). Coaches and trainers should be able to access the statistics in real time in order to draw conclusions (e.g. about optimal running routes). Because of the number of sensors potentially in use, very large amounts of data can be generated. Furthermore, the sensors deliver their data autonomously (as so-called active data sources), i.e. they cannot be controlled by the processing system. The result is so-called data streams, which cannot be stored in full. As the data must also be processed promptly, a data stream management system (DSMS) is to be used in the project group. It receives the data, processes it immediately and outputs the results. How the data is processed is specified by means of queries. As a high data rate and volume are to be expected, a distributed solution is additionally sought. The core of the project is therefore to process the data streams in a distributed manner (see figure).
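To illustrate what a continuous query over such a data stream might look like, the following is a minimal sketch in plain Python. It does not use Odysseus or its query languages; the `Position` schema, the function name `running_distance` and the sample values are assumptions made purely for illustration. Each incoming reading is processed immediately and yields an updated result, and only a small per-player state is kept rather than the stream itself.

```python
import math
from typing import Dict, Iterable, Iterator, NamedTuple, Tuple


class Position(NamedTuple):
    """One sensor reading (hypothetical schema, not the project's format)."""
    timestamp: float  # seconds since kick-off
    player_id: int
    x: float          # metres
    y: float          # metres


def running_distance(stream: Iterable[Position]) -> Iterator[Tuple[float, int, float]]:
    """Continuous query: total distance covered per player, updated on every reading."""
    last: Dict[int, Position] = {}   # most recent reading per player
    total: Dict[int, float] = {}     # distance accumulated so far per player
    for reading in stream:
        prev = last.get(reading.player_id)
        if prev is None:
            total[reading.player_id] = 0.0
        else:
            step = math.hypot(reading.x - prev.x, reading.y - prev.y)
            total[reading.player_id] += step
        last[reading.player_id] = reading
        # Emit a result as soon as the reading has been processed.
        yield (reading.timestamp, reading.player_id, total[reading.player_id])


# Invented sample readings standing in for a live sensor stream.
sample = [
    Position(0.0, 7, 0.0, 0.0),
    Position(0.0, 9, 10.0, 10.0),
    Position(1.0, 7, 3.0, 4.0),
    Position(1.0, 9, 10.0, 12.0),
    Position(2.0, 7, 3.0, 10.0),
]
results = list(running_distance(sample))
```

Because the function is a generator, it works equally on an unbounded source: results are produced one by one, and memory use depends on the number of players, not on the length of the stream.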
The basis for this is a peer-to-peer (P2P) network in which each peer runs its own DSMS. Each peer acts autonomously and can join and leave the network at any time. The DSMS used is Odysseus, an extensible framework for constructing DSMSs. It contains all the basic functions and components for data stream processing as well as first steps towards P2P computing. This means that the project group does not have to deal with basic concepts such as query languages, query translation and (local) execution, but can concentrate on the distributed concepts. The following subtasks are to be realised:
- Connection of the sensors as data sources
- Designing a suitable GUI for displaying sports statistics in real time
- Decomposition and distribution of continuous queries that calculate the statistics based on the sensor data
- Load balancing: "shifting" queries from peer to peer to even out the load
- Replication: Multiple execution of a query to compensate for network failures
- Recovery: Simple reintegration of previously failed peers
- Parallelisation: Decomposition of a data stream so that it can be processed by several peers (and subsequent merging of the results)
- Extensibility: It should be possible to integrate new/improved components at runtime without having to shut down the network
- Query sharing: (Partial) results of queries should be reusable in other queries
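The parallelisation subtask can be sketched as follows, again in plain Python and without any Odysseus API. The tuple schema, the function names and the modulo partitioning rule are assumptions for illustration only, and the "peers" are simulated locally with lists instead of network connections. The stream is split by player, each partition is processed by the same operator, and the partial result streams are merged back in timestamp order.

```python
import heapq
from typing import Dict, Iterable, Iterator, List, Tuple

# (timestamp, player_id, speed in m/s): hypothetical schema
Reading = Tuple[float, int, float]


def partition(stream: Iterable[Reading], n_peers: int) -> List[List[Reading]]:
    """Split one stream into n_peers sub-streams, keyed by player_id."""
    parts: List[List[Reading]] = [[] for _ in range(n_peers)]
    for reading in stream:
        parts[reading[1] % n_peers].append(reading)
    return parts


def max_speed_so_far(part: Iterable[Reading]) -> Iterator[Reading]:
    """Operator run by each peer: running maximum speed per player."""
    best: Dict[int, float] = {}
    for ts, player, speed in part:
        best[player] = max(speed, best.get(player, speed))
        yield (ts, player, best[player])


def merge(partials: Iterable[Iterable[Reading]]) -> Iterator[Reading]:
    """Recombine the partial result streams in timestamp order."""
    return heapq.merge(*partials, key=lambda r: r[0])


# Invented sample readings, already ordered by timestamp.
stream = [
    (0.0, 1, 2.0),
    (0.0, 2, 3.0),
    (1.0, 1, 5.0),
    (1.0, 2, 1.0),
    (2.0, 1, 4.0),
]
parts = partition(stream, n_peers=2)
merged = list(merge(max_speed_so_far(p) for p in parts))
single = list(max_speed_so_far(stream))  # same operator without partitioning
```

Partitioning by the same key that the operator uses for its state (here the player) means that no peer needs state held by another peer, so the merged output contains the same results as the unpartitioned run. A real deployment would additionally have to handle unbounded sub-streams, peers joining and leaving, and out-of-order arrival, none of which this sketch covers.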
The subtasks should be implemented as additional components in Odysseus. This also ensures that the solutions developed are as generally applicable as possible. By the end of the project, it should be possible to use various data sources:
- During development, a sample data set is provided so that the technical aspects of the sensor hardware can be set aside. The data set contains a recording of a football match in which the position data of all players was captured over the entire course of the match. This makes it easy to evaluate the project group's initial results.
- Later on, real sensors will be connected that transmit their data to the processing system in real time. The project group will first test the functionality itself, so that the developed product can then be tested under "real" conditions: the processing of a basketball game in real time. Initial exploratory talks are currently under way to involve EWE Baskets in the project.
- It should also be possible to connect alternative sources in order to demonstrate the flexibility of the solution. A data set from wind turbines is therefore provided. Data from the field of astronomy will also be made available.
The result should be a flexible Odysseus that can be distributed in a P2P network and that can use multiple peers robustly and reliably to process data sources. The sports analysis use case is intended to show that efficient distributed data stream processing is indeed possible.
Questions, criticism and comments to Timo Michelsen (timo.michelsen at uni-oldenburg.de), Room O66 (at OFFIS), Phone (office): 0441 / 9722-141
