Machine Learning Solutions for Segmented and Fully Distributed Computing Environments

Many emerging Internet of Things applications are built on uploading all the gathered data to a distributed cloud infrastructure, where complex data mining and machine learning solutions can be applied. However, in many cases data are confidential, and cannot be uploaded to a remote cloud. Also, building global data models might not capture all the local specificities and locally relevant data patterns. What now?

This thesis aims to devise distributed machine learning solutions that combine locally generated data with information acquired from open data sources. The developed methods are being tested in use-case scenarios related to precision farming. This industrial doctorate programme, to be carried out in Budapest, will also include a six-month stay at the Eindhoven University of Technology in the Netherlands.


Today we are experiencing unprecedented growth of IoT applications in all fields of everyday life. These applications generate huge amounts of data that must be analysed and mined for complex data patterns and long-term correlations in space and time. Such operations are very resource-hungry, so distributed cloud infrastructures should be employed. However, because of privacy concerns, the gathered data often have to be anonymised, which might introduce noise into the data and affect subsequent analytics. In addition, local models should complement the generic patterns that might be derived from globally gathered data. Precision farming is a good use-case test bed for this research: each farm, described by region-specific conditions, requires privacy-preserving, real-time data processing and analytics at a local level, but can at the same time make use of globally valid patterns for the given sector.


The intended solution is based on combining meta-learning and fully distributed machine learning approaches. In meta-learning, data are represented by attributes, called meta-features, that capture the main characteristics of the data. These meta-features and the local models might be used to build a global model while preserving privacy to some extent. Local models might be learned in segmented/segregated clouds, while the global model is built on a central cloud. For building such a computational infrastructure, approaches and methods developed in edge computing will be considered. The proposed hybrid approach, lying on the boundary of meta-learning, fully distributed machine learning and edge computing, is innovative, and its development is expected to yield a deployable solution for different domains such as precision farming or smart cities, to name just a few.
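To make the idea of meta-features more concrete, the following minimal sketch shows how each local site could summarise its data and share only the summary with a central node. It is an illustration under stated assumptions, not the method to be developed in the thesis: the chosen meta-features (count, mean, standard deviation, minimum, maximum), the function names `meta_features` and `global_mean`, and the soil-moisture readings are all hypothetical.

```python
from statistics import mean, pstdev


def meta_features(values):
    """Summarise a local dataset by a few simple meta-features.

    The choice of features here is an assumption made for illustration.
    """
    return {
        "n": len(values),
        "mean": mean(values),
        "std": pstdev(values),
        "min": min(values),
        "max": max(values),
    }


def global_mean(summaries):
    """Combine local summaries into a global estimate.

    Each local mean is weighted by its sample count, so the result
    equals the mean of the pooled data without pooling the raw values.
    """
    total = sum(s["n"] for s in summaries)
    return sum(s["n"] * s["mean"] for s in summaries) / total


# Hypothetical soil-moisture readings that stay at each farm.
farm_a = [0.20, 0.25, 0.30]
farm_b = [0.40, 0.50]

# Only these summaries would leave the local (segmented) cloud.
summaries = [meta_features(farm_a), meta_features(farm_b)]

# Central node: build a very simple "global model" from the summaries.
print(global_mean(summaries))
```

In this sketch the raw readings never leave the farms, which matches the "to some extent" wording above: summary statistics can still leak information, so a real system would need stronger privacy guarantees than this example provides.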

Expected outcome

The goal of the thesis is the research and development of novel Machine Learning and Data Mining techniques and their integration into a fully distributed, privacy-preserving, meta machine learning framework for segmented cloud infrastructures. These techniques will then be ready to be applied in different application scenarios.


The doctoral student involved in this programme will share their time between the Co-Location Centre of the EIT Digital Budapest Node, the premises of Magyar Telekom, and the Eötvös Loránd University. A six-month mobility at the Eindhoven University of Technology will also be part of the programme.


  • Industrial partner: Magyar Telekom
  • Academic/research partner: Eötvös Loránd University
  • Number of available PhD positions: 1
  • Duration: 4 years
  • This PhD will be funded by EIT Digital, Eötvös Loránd University and Magyar Telekom


Those interested in applying should send an e-mail to, including a CV, a motivation letter, and documents showing their academic track records.

Please apply before November 1, 2017.

© 2010-2019 EIT Digital IVZW. All rights reserved.