January 16 to 18, 2015, Santa Clara, USA.

Big Data Bootcamp

Big Data Track

Day 1 (January 16, 7:30 AM - 8:00 PM)

7:30 AM - 8:00 AM Registration
8:00 AM - 9:00 AM Big Data: Succeeding with Polyglot Persistence (Vladimir Bacvanski, Founder, SciSpike)
9:00 AM - 10:20 AM Introduction to Hadoop 2.0
10:20 AM - 10:30 AM Break
10:30 AM - 12:00 PM HDFS Deep Dive, HDFS Lab
12:00 PM - 12:30 PM Lunch Break
12:30 PM - 3:00 PM MapReduce Deep Dive, MapReduce Lab
3:00 PM - 4:00 PM Common pitfalls in developing MapReduce applications and how to avoid them (Daniel Templeton, Cloudera)
4:00 PM - 4:10 PM Break
4:10 PM - 5:00 PM Hadoop: Amazon EC2 and CDH Setup & Hive Workshop
5:00 PM - 6:00 PM From 0 to Streaming: Spark and Cassandra (Russell Spitzer, DataStax)
6:00 PM - 8:00 PM Online experimentation, hypothesis testing, big data analysis (Zhenyu Zhao, Yahoo)

Day 2 (January 17, 8:00 AM - 8:00 PM)

8:00 AM - 8:15 AM Introduction to NoSQL
8:15 AM - 12:00 PM HBase Workshop
12:00 PM - 12:30 PM Lunch Break
12:30 PM - 2:30 PM Use Case: Convert NYSE raw data into a dashboard using the Hadoop ecosystem with no programming, using SQL/Hive queries & Tableau
2:30 PM - 3:30 PM Hadoop Operations at Rocket Fuel (Kishore Yellamraju, Rocket Fuel)
3:30 PM - 3:40 PM Break
3:40 PM - 4:30 PM Apache Ambari features to manage a cluster, plus architecture deep-dive (Alejandro Fernandez, Hortonworks)
4:30 PM - 5:30 PM Breaking Into Big Data
5:30 PM - 6:00 PM Big Data Security: Security from Big Data & Security for Big Data
6:00 PM - 6:10 PM Break
6:10 PM - 8:00 PM Hadoop Deployment Challenges

Day 3 (January 18, 8:00 AM - 8:00 PM)

8:00 AM - 9:00 AM Big Data and Cyber Security - A case study for DDoS (Dr. Satyam Priyadarshy, Halliburton)
9:00 AM - 12:00 PM Data Wrangling With R - An Etude: Workshop (Krishna Sankar, Blackarrow.tv)
12:00 PM - 1:00 PM Lunch Break
1:00 PM - 1:30 PM Big Data Trends and Startups (Sanjit Singh, Investment Director, Intel)
1:30 PM - 6:00 PM Optimization & Performance Tuning in Hadoop (Workshop)
* Interpret Counters
* Understanding Performance tuning criteria
* Split Size
* Number of Reducers
  - Addressing skew
  - Custom partitioners
* Combiners
* Distributed Cache
* Hive Map Side Joins
* Compression
* Combine File Input Format
* Filtering
* Sorting (Global Sorting and key based sorting)
  - Order by vs. Sort by
* Hive Partitioning
* JVM Reuse
6:00 PM - 8:00 PM Use Case: Apache Spark after Dark (Chris Fregly, Author of Effective Spark & Databricks)

Big Data Track II

January 17, 10:00 AM - 8:00 PM

10:00 AM - 10:15 AM Introduction to NoSQL
10:15 AM - 12:00 PM MongoDB Workshop
12:00 PM - 12:30 PM Lunch Break
12:30 PM - 4:30 PM Cassandra Workshop
4:30 PM - 5:30 PM Big Data Security: Security from Big Data & Security for Big Data
5:30 PM - 6:00 PM Breaking Into Big Data
6:00 PM - 6:10 PM Break
6:10 PM - 8:00 PM Hadoop Deployment Challenges

January 18, 9:00 AM - 6:00 PM, Big Data Track II: Big Data Analytics with Spark (full-day workshop)

9:00 AM - 12:30 PM
* Spark Core
* Spark SQL
* Spark Streaming
* Machine Learning and MLlib
* Scala
* Advanced Spark
12:30 PM - 1:00 PM Lunch Break
1:00 PM - 1:30 PM Big Data Trends and Startups (Sanjit Singh, Investment Director, Intel)
1:30 PM - 6:00 PM
* Scala Lab
* Spark Core Lab
* Spark SQL Lab
* MLlib Lab
* Spark Streaming Lab

NOTE: Agenda and speakers subject to change without notice
