Hosted Science: Managing Computational Workflows in the Cloud

Ewa Deelman, Gideon Juve, Maciej Malawski, Jarek Nabrzyski

Parallel Processing Letters, 23 (02), 1340004. World Scientific Publishing Co. https://doi.org/10.1142/S0129626413400045

Scientists today are exploring new tools and computing platforms for their research. They use workflow management tools to describe and manage complex applications, and they are evaluating the features and performance of clouds to see whether these meet their computational needs. Although hosting today is limited to providing virtual resources and simple services, one can imagine that in the future entire scientific analyses will be hosted for the user. The user would specify the desired analysis, the timeframe of the computation, and the available budget. Hosted services would then deliver the desired results within the provided constraints. This paper describes current work on managing scientific applications on the cloud, focusing on workflow management and related data management issues. Frequently, applications are represented not by single workflows but by sets of related workflows, called workflow ensembles. Thus, hosted services need to be able to manage entire workflow ensembles, evaluating tradeoffs between completing as many high-value ensemble members as possible and delivering results within a certain time and budget. This paper gives an overview of existing hosted science issues, presents the current state of the art in resource provisioning that can support it, and outlines future research directions in this field.