This online course introduces you to various high-performance computing (HPC) facilities for big data analysis. These include R, a programming language renowned for its simplicity, elegance and community support, and Hadoop, an open source, Java-based programming framework for processing large datasets. You will find out how to use them while avoiding common pitfalls, saving you time and money.
What topics will you cover?
- First steps in R and RStudio
- Working with Apache Hadoop 1 – Fundamentals
- Working with Apache Hadoop 2 – RHadoop
- Statistical learning using RHadoop
What will you achieve?
By the end of the course, you will:
- Understand how the performance of modern supercomputing is achieved
- Understand the basic functionality of the Bash terminal window
- Understand the basic functionality of Apache Hadoop for scalable, distributed computing
- Understand the basic functionality of RHadoop
- Understand the basic problems of supervised and unsupervised learning
- Perform basic clustering, regression, and classification with RHadoop