[hpc-announce] Special Issue on Parallel and Distributed Data Mining

Massimo Cafaro massimo.cafaro at unisalento.it
Wed Oct 18 09:22:41 CDT 2017

Special Issue on Parallel and Distributed Data Mining
Information Sciences, Elsevier

The sheer volume of new data, which is being generated at an increasingly fast pace, has already produced the anticipated data deluge, which is difficult to cope with. We are faced with an overwhelmingly vast quantity of data, owing to how easy it is to produce or derive digital data. Even storing this massive amount of data is becoming a highly demanding task, outpacing the current development of hardware and software infrastructure. Nonetheless, this effort must be undertaken now for the preservation, organization and long-term maintenance of this precious data. However, the collected data is useless without the ability to fully understand and make use of it. Therefore, we need new algorithms to address this challenge.

Data mining techniques and algorithms that process huge amounts of data in order to extract useful and interesting information have become popular in many different contexts. Algorithms are required to make sense of data automatically and efficiently. Nonetheless, even though the performance of sequential computer systems is improving, these systems cannot keep up with the growing demand for data mining applications and the growing data sizes. Moreover, the main memory of sequential systems may not be enough to hold all the data related to current applications.

This Special Issue takes into account the increasing interest in the design and implementation of parallel and distributed data mining algorithms. Parallel algorithms can address both the running time and memory requirement issues by exploiting the vast aggregate main memory and processing power of processors and accelerators available on parallel computers. However, parallelizing existing algorithms in order to achieve good performance and scalability on massive datasets is not trivial. Indeed, a good data organization and decomposition strategy is of paramount importance in order to balance the workload while minimizing data dependences. Another concern is minimizing synchronization and communication overhead. Finally, I/O costs should be minimized as well. Creating breakthrough parallel algorithms for high-performance data mining applications requires addressing several key computing problems, which may lead to novel solutions and new insights in interdisciplinary applications.

Moreover, data is increasingly spread among different geographically distributed sites. Centralized processing of this data is very inefficient and expensive. In some cases, it may even be impractical and subject to security risks. Therefore, processing the data while minimizing the amount of data exchanged, and at the same time guaranteeing correctness and efficiency, is an extremely important challenge. Distributed data mining performs data analysis and mining in a fundamentally distributed manner, paying careful attention to resource constraints, in particular bandwidth limitations, privacy concerns and computing power.

The focus of this Special Issue is on all forms of advances in high-performance and distributed data mining algorithms and applications. The topics relevant to the Special Issue include (but are not limited to) the following.


Scalable parallel data mining algorithms using message-passing, shared-memory or hybrid programming paradigms

Exploiting modern parallel architectures including FPGA, GPU and many-core accelerators for parallel data mining applications

Middleware for high-performance data mining on grid and cloud environments

Benchmarking and performance studies of high-performance data mining applications

Novel programming paradigms to support high-performance computing for data mining

Performance models for high-performance data mining applications and middleware

Programming models, tools, and environments for high-performance computing in data mining

MapReduce-based parallel data mining algorithms

Caching, streaming, pipelining, and other optimization techniques for data management in high-performance computing for data mining

Novel distributed data mining algorithms


All manuscripts and any supplementary material should be submitted electronically through the Elsevier Editorial System (EES) at http://ees.elsevier.com/ins. Authors must select “SI:PDDM” when they reach the “Article Type” step in the submission process.

Detailed submission guidelines are available in the “Guide to Authors” at: http://www.elsevier.com/journals/information-sciences/0020-0255/guide-for-authors.


Submission deadline: December 1st, 2017
First round notification: March 1st, 2018
Revised version due: May 1st, 2018
Final notification: June 1st, 2018
Camera-ready due: July 1st, 2018
Tentative publication date: October 2018

Guest editors:

Massimo Cafaro, Email: massimo.cafaro at unisalento.it
University of Salento, Italy and Euro-Mediterranean Centre on Climate Change, Foundation

Italo Epicoco, Email: italo.epicoco at unisalento.it
University of Salento, Italy and Euro-Mediterranean Centre on Climate Change, Foundation

Marco Pulimeno, Email: marco.pulimeno at unisalento.it
University of Salento, Italy



 Massimo Cafaro, Ph.D.
 Associate Professor
 Dept. of Engineering for Innovation
 University of Salento, Lecce, Italy
 Via per Monteroni
 73100 Lecce, Italy
 Voice/Fax  +39 0832 297371
 Web  http://sara.unisalento.it/~cafaro
 E-mail massimo.cafaro at unisalento.it
 cafaro at ieee.org
 cafaro at acm.org

 CMCC Foundation
 Euro-Mediterranean Center on Climate Change
 Via Augusto Imperatore, 16 - 73100 Lecce
 massimo.cafaro at cmcc.it

