Fast Data Processing with Spark 2 - Third Edition

Packt Publishing
SKU: 9781785889271 | ISBN13: 9781785889271
$50.44
Condition: New
Usually Ships in 24hrs
A fast, practical, and informative guide introducing you to Big Data and Data Science with Apache Spark

About This Book

* This practical tutorial will help developers get started with Spark and explore its main features and architecture
* This single comprehensive resource offers an easy introduction to the framework, developing your knowledge quickly and pain-free
* This book is based on the latest version of Apache Spark, giving you the most up-to-date information

Who This Book Is For

This book is for developers with little to no knowledge of Spark, but with a background in Scala/Java programming. It's recommended that you have experience working with big data and a strong interest in data science.

What You Will Learn

* Install and set up Spark in your cluster
* Prototype distributed applications with Spark's interactive shell
* Perform data wrangling using the new DataFrame APIs
* Get to know the different ways to interact with Spark's distributed representation of data (RDDs)
* Query Spark with a SQL-like query syntax
* See how Spark works with Big Data
* Implement machine learning systems with highly scalable algorithms
* Use R, the popular statistical language, to work with Spark
* Apply interesting graph algorithms and graph processing with GraphX

In Detail

Spark is an open source cluster computing system that is designed to process large datasets at high speed and with ease of development (one standard, rather than a combination of tools such as Hive, Pig, and MapReduce for Hadoop). Spark has some similarities to the Hadoop platform, but it provides the ability to load data in-memory and query it repeatedly (using Spark SQL), making it much quicker than disk-based systems such as Hadoop.

This book begins with how to download and set up Spark and then proceeds to explore Spark progressively, from simple APIs to machine learning and graph processing. You will learn to use the Spark shell, load data, and perform machine learning. You will find out how to build and run your own Spark applications. We will show you how to manipulate your RDDs, and you'll get an understanding of the various DataFrame APIs. The book also covers the most common machine learning algorithms using Spark MLlib and programming Spark with the R language. Finally, we cover graph processing with the GraphX APIs.


  • Author: Krishna Sankar
  • Publisher: Packt Publishing
  • Publication Date: Oct 21, 2016
  • Number of Pages: 274 pages
  • Language: English
  • Binding: Paperback
  • ISBN-10: 1785889273
  • ISBN-13: 9781785889271