Speaker "Peter Nghiem" Details
Name
Peter Nghiem
Company
Santa Clara University
Designation
Author
Topic
Best Trade-off Point Algorithm for Efficient Resource Provisioning in Hadoop
Abstract
The fast-growing cloud ecosystem is predicted to have up to 25 billion IoT sensor devices connected by 2020. This large number of IoT devices will generate hundreds of zettabytes of information in the cloud, to be analyzed by Big Data processing engines such as Hadoop MapReduce and Spark and by other analytics platforms, delivering practical value in business, technology, and manufacturing processes for better innovation and more intelligent decisions. In this era of exponential growth in Big Data, energy efficiency has become an important issue for the ubiquitous Hadoop MapReduce framework. However, the question of how many tasks a job needs in order to get the most efficient performance from Hadoop MapReduce still has no definite answer.
In this talk, I will present my patent-pending Best Trade-off Point method and algorithm, with mathematical formulas for obtaining the exact optimal number of task resources for any workload running on Hadoop. This approach for determining the best trade-off point between performance and resources shows that the currently well-known rules of thumb for calculating the required number of reduce tasks for a job are inaccurate and can lead to significant waste of computing resources and energy. The method can be applied to any target system with an elbow curve: computing systems, network data-routing systems, payload engine/rocket systems, lean manufacturing systems, cost-versus-quality control systems, and elbow yield-curve systems generally, including yield curves for various types of securities, convex isoquant curves, semiconductor/IC yield curves, and economic yield curves.
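To give a feel for the elbow-curve idea the abstract refers to, here is a minimal sketch of one generic knee-detection heuristic: pick the point on the curve farthest from the straight chord joining its endpoints. This is a common textbook heuristic, not the speaker's patented Best Trade-off Point algorithm, and the task counts and runtimes below are purely illustrative.

```python
def elbow_index(xs, ys):
    """Return the index of the point farthest from the straight line
    joining the first and last points of the curve (a common elbow heuristic)."""
    x0, y0 = xs[0], ys[0]
    x1, y1 = xs[-1], ys[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = (dx * dx + dy * dy) ** 0.5
    best_i, best_d = 0, -1.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        # Perpendicular distance from (x, y) to the chord between endpoints.
        d = abs(dy * (x - x0) - dx * (y - y0)) / norm
        if d > best_d:
            best_i, best_d = i, d
    return best_i

# Hypothetical curve: job runtime (seconds) vs. number of reduce tasks,
# showing diminishing returns past a certain task count.
tasks = [1, 2, 4, 8, 16, 32, 64]
runtime = [640, 330, 180, 110, 95, 90, 88]

i = elbow_index(tasks, runtime)
print(tasks[i])  # prints 8: beyond ~8 tasks, extra resources buy little speedup
```

Past the elbow, each additional task yields only a marginal runtime improvement while consuming more cluster resources and energy, which is the trade-off the talk quantifies.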