Tag: Apache

Learning Path – Getting Started with Apache Spark


Last updated: 11/2024
MP4 | Video: h264, 1280×720 | Audio: AAC, 44.1 kHz, 2 Ch
Genre: eLearning | Language: English + subtitles | Duration: 417 lessons (23h 56m) | Size: 3.47 GB
Apache Spark is an open-source, unified analytics engine for large-scale data processing. It provides comprehensive APIs in Python, Java, Scala, and R. Spark is designed for speed and efficiency, excelling in complex analytics across big data sets. It supports batch processing, real-time streaming, machine learning, and graph processing tasks. This learning path is intended to help data professionals get started working with Apache Spark.


In-Memory Analytics with Apache Arrow, 2nd Edition


Free Download In-Memory Analytics with Apache Arrow: Accelerate data analytics for efficient processing of flat and hierarchical data structures, 2nd Edition by Matthew Topol
English | September 30, 2024 | ISBN: 1835461220 | 406 pages | True PDF | 17.30 MB
Harness the power of Apache Arrow to optimize tabular data processing and develop robust, high-performance data systems with its standardized, language-independent columnar memory format


In-Memory Analytics with Apache Arrow, 2nd Edition [Repost]


Free Download In-Memory Analytics with Apache Arrow: Accelerate data analytics for efficient processing of flat and hierarchical data structures, 2nd Edition by Matthew Topol
English | September 30, 2024 | ISBN: 1835461220 | True EPUB | 406 pages | 9.1 MB
Harness the power of Apache Arrow to optimize tabular data processing and develop robust, high-performance data systems with its standardized, language-independent columnar memory format


Apache Spark for Machine Learning: Build and deploy high-performance big data AI solutions for large-scale clusters


Free Download Apache Spark for Machine Learning: Build and deploy high-performance big data AI solutions for large-scale clusters by Deepak Gowda
English | November 1, 2024 | ISBN: 1804618160 | 306 pages | True PDF | 11.00 MB
Develop your data science skills with Apache Spark to solve real-world problems for Fortune 500 companies using scalable algorithms on large cloud computing clusters


Big Data Analytics with Hadoop and Apache Spark (2024)


Released: 10/2024
MP4 | Video: h264, 1280×720 | Audio: AAC, 44.1 kHz, 2 Ch
Skill Level: Intermediate | Genre: eLearning | Language: English + srt | Duration: 51m | Size: 119 MB
Apache Hadoop was a pioneer in the world of big data technologies, and it continues to lead in enterprise big data storage. Apache Spark is the top big data processing engine and provides an impressive array of features and capabilities. When used together, the Hadoop Distributed File System (HDFS) and Spark can provide a truly scalable setup for big data analytics. In this course, data analytics expert Kumaran Ponnambalam shows you how to leverage these two technologies to build scalable and optimized data analytics pipelines. Explore ways to optimize data modeling and storage on HDFS; discuss scalable data ingestion and extraction using Spark; and review actionable tips for optimizing data processing in Spark. Plus, complete a use case project that allows you to practice your new techniques.


Databricks Certified Associate Developer for Apache Spark Using Python


Free Download Databricks Certified Associate Developer for Apache Spark Using Python: The ultimate guide to getting certified in Apache Spark using practical examples with Python by Saba Shah
English | June 14, 2024 | ISBN: 1804619787 | 274 pages | PDF | 7.11 MB
Learn the concepts and exercises needed to get certified as a Databricks Associate Developer for Apache Spark 3.0 and validate your skills as a Spark expert with an industry-recognized credential


Apache Spark Essential Training – Big Data Engineering (2024)


Released: 10/2024
Duration: 1h 4m | MP4, 1280×720, 30 fps | AAC, 48 kHz, 2 Ch | 146 MB
Level: Intermediate | Genre: eLearning | Language: English
Data engineering is the foundation for building analytics and data science applications in the new Big Data world. Data engineering requires combining multiple big data technologies to construct data pipelines and networks to stream, process, and store data. This course focuses on building full-fledged solutions that combine Apache Spark with other big data tools to create end-to-end data pipelines. Instructor Kumaran Ponnambalam begins by defining data engineering, its functions, and its concepts. Next, Kumaran goes over how Spark capabilities such as parallel processing, execution plans, state management options, and machine learning work with extract, transform, load (ETL). He introduces you to batch processing use cases and processes, as well as real-time processing pipelines. After taking you through several useful best practices, Kumaran concludes with an end-to-end exercise project.
