Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks. In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization ch ...
Data cleaning has become an indispensable part of data analysis due to the increasing amount of dirty data. Data scientists spend most of their time cleaning dirty data before it can be used for analysis. Existing solutions that attempt to automate t ...
An important prerequisite for developing trustworthy artificial intelligence is high-quality data. Crowdsourcing has emerged as a popular method of data collection in the past few years. However, there is always a concern about the quality of the data thus ...
Information-intensive transformation is vital to realize the Industry 4.0 paradigm, where processes, systems, and people are in a connected environment. Current factories must combine different sources of knowledge with different technological layers. Taki ...
Digital Twins (DT) are proposed in industries to support the entire lifecycle of services from different perspectives. The lack of a systematic analysis of DT concepts has led to varying definitions and services, which challenges DT developers with data integrat ...
Data are essential to urban building energy models, and yet obtaining sufficient and accurate building data at a large scale is challenging. Previous studies have highlighted that the impact of data on urban case studies has not been sufficiently discussed. T ...
The ever-growing number of edge devices (e.g., smartphones) and the exploding volume of sensitive data they produce call for distributed machine learning techniques that are privacy-preserving. Given the increasing computing capabilities of modern edge de ...