Dynamic profiling and feedback framework for reduce-side join

Makoto Nakayama, Kenichi Yamazaki, Satoshi Tanaka, Hironori Kasahara

Research output: Paper

Abstract

MapReduce has become popular, and Reduce-side join is one of the most important applications of MapReduce. Data skew, in which the data load assigned to each Reduce task varies from task to task, increases the MapReduce job completion time. This paper proposes a dynamic profiling and feedback framework that works on a MapReduce cluster. The framework allows programmers to build their own algorithms to address data skew in Reduce-side join based on their specific knowledge and/or requirements. This paper also proposes an estimation method that lets the framework adapt to a wide range of MapReduce cluster sizes. It presents two example algorithms that address data skew using the estimation method, and the experimental results show a speed-up of up to 2.59 times in join completion time on a cluster of 50 servers with highly skewed input data.

Original language: English
Pages: 1255-1262
Number of pages: 8
DOI: https://doi.org/10.1109/CSE.2013.187
Publication status: Published - 1 Dec 2013
Event: 2013 16th IEEE International Conference on Computational Science and Engineering, CSE 2013 - Sydney, NSW, Australia
Duration: 3 Dec 2013 to 5 Dec 2013

Conference

Conference: 2013 16th IEEE International Conference on Computational Science and Engineering, CSE 2013
Country: Australia
City: Sydney, NSW
Period: 3 Dec 2013 to 5 Dec 2013

ASJC Scopus subject areas

  • Computer Science (miscellaneous)

  • Cite this

    Nakayama, M., Yamazaki, K., Tanaka, S., & Kasahara, H. (2013). Dynamic profiling and feedback framework for reduce-side join. 1255-1262. Paper presented at 2013 16th IEEE International Conference on Computational Science and Engineering, CSE 2013, Sydney, NSW, Australia. https://doi.org/10.1109/CSE.2013.187