Tag: Hadoop

Big Data Analytics with Hadoop and Apache Spark (2024)


Free Download Big Data Analytics with Hadoop and Apache Spark (2024)
Released 10/2024
MP4 | Video: h264, 1280×720 | Audio: AAC, 44.1 kHz, 2 Ch
Skill Level: Intermediate | Genre: eLearning | Language: English + srt | Duration: 51m | Size: 119 MB
Apache Hadoop was a pioneer in the world of big data technologies, and it continues to lead in enterprise big data storage. Apache Spark is the top big data processing engine and provides an impressive array of features and capabilities. When used together, the Hadoop Distributed File System (HDFS) and Spark can provide a truly scalable setup for big data analytics. In this course, data analytics expert Kumaran Ponnambalam shows you how to leverage these two technologies to build scalable and optimized data analytics pipelines. Explore ways to optimize data modeling and storage on HDFS; discuss scalable data ingestion and extraction using Spark; and review actionable tips for optimizing data processing in Spark. Plus, complete a use case project that allows you to practice your new techniques.


Pro Hadoop


Free Download Pro Hadoop By Jason Venner
2009 | 440 Pages | ISBN: 1430219424 | PDF | 8 MB
You’ve heard the hype about Hadoop: it runs petabyte-scale data mining tasks insanely fast, it runs gigantic tasks on clouds for absurdly cheap, it’s been heavily committed to by tech giants like IBM, Yahoo!, and the Apache Project, and it’s completely open source. But what exactly is it, and more importantly, how do you even get a Hadoop cluster up and running? From Apress, the name you’ve come to trust for hands-on technical knowledge, Pro Hadoop brings you up to speed on Hadoop. You learn the ins and outs of MapReduce; how to structure a cluster and design and implement the Hadoop file system; and how to build your first cloud-computing tasks using Hadoop. Learn how to let Hadoop take care of distributing and parallelizing your software: you just focus on the code, and Hadoop takes care of the rest.


Programming Pig: Dataflow Scripting with Hadoop


Free Download Programming Pig: Dataflow Scripting with Hadoop By Alan Gates
2011 | 224 Pages | ISBN: 1449302645 | PDF | 8 MB
This guide is an ideal learning tool and reference for Apache Pig, the open source engine for executing parallel data flows on Hadoop. With Pig, you can batch-process data without having to create a full-fledged application, making it easy for you to experiment with new datasets. Programming Pig introduces new users to Pig, and provides experienced users with comprehensive coverage of key features such as the Pig Latin scripting language, the Grunt shell, and User Defined Functions (UDFs) for extending Pig. If you need to analyze terabytes of data, this book shows you how to do it efficiently with Pig.


Apache Hadoop 3 Quick Start Guide


Free Download Apache Hadoop 3 Quick Start Guide: Learn about big data processing and analytics
English | 2018 | ISBN: 9781788999830 | 222 Pages | PDF EPUB MOBI (True) | 18 MB
Apache Hadoop is a widely used distributed data platform. It lets large datasets be stored and processed efficiently across a cluster of machines, rather than on one large computer. This book will get you started with the Hadoop ecosystem and introduce you to the main technical topics, including MapReduce, YARN, and HDFS.


Big Data and Hadoop – 2nd Edition


Free Download Big Data and Hadoop: Fundamentals, tools, and techniques for data-driven success – 2nd Edition
English | 2024 | ISBN: 9355516665 | 769 Pages | EPUB (True) | 28 MB
Start with the fundamentals of big data, exploring its growing significance and diverse applications. You’ll look into the heart of the Apache Hadoop ecosystem, mastering its core components like HDFS and MapReduce. We’ll demystify NoSQL databases, introducing you to HBase and Cassandra as powerful alternatives to traditional databases.
