Playlist "Swiss Python Summit 2024"

Demystifying Spark: A Deep Dive into Its Workings

Neil Gibbons

Apache Spark is a powerful framework often used alongside Python for big data processing. You've seen its capabilities, but what powers its impressive performance? In this session, we'll delve into the internal workings of Spark. We'll explore concepts like Resilient Distributed Datasets (RDDs), which are fundamental to Spark's fault tolerance. We'll see how Spark distributes tasks across a cluster, leveraging Python's strengths in parallel processing. Finally, we'll uncover the secrets of in-memory computation, the key to Spark's blazing speed.

Gaining a deeper understanding of Spark's internals, especially within the Python ecosystem, empowers you to:

- Optimize your Python big data applications for peak performance.
- Troubleshoot issues more efficiently.
- Write effective Spark code that unlocks its true potential and complements your Python expertise.

Whether you're a data scientist, developer, or simply curious about big data, this talk will bridge the gap between Python and Spark.
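To make the ideas above concrete before the talk, here is a minimal pure-Python sketch of three concepts the abstract mentions: lazy transformations recorded as a lineage (the basis of RDD fault tolerance), computation split across partitions (standing in for tasks on a cluster), and in-memory caching of results. The `MiniRDD` class is entirely hypothetical and is not the real PySpark API; it only mimics the shape of calls like `map`, `filter`, `cache`, and `collect`.

```python
# Illustrative sketch only -- NOT the real Spark/PySpark API.
# Shows lazy lineage, per-partition execution, and in-memory caching.

class MiniRDD:
    def __init__(self, partitions, lineage=()):
        self.partitions = partitions      # list of lists, simulating cluster partitions
        self.lineage = lineage            # lazily recorded transformations
        self.cache_enabled = False
        self._materialized = None         # filled only after cache() + an action

    def map(self, fn):
        # Transformations are lazy: nothing runs, we only extend the lineage.
        return MiniRDD(self.partitions, self.lineage + (("map", fn),))

    def filter(self, pred):
        return MiniRDD(self.partitions, self.lineage + (("filter", pred),))

    def cache(self):
        # Mark for in-memory reuse; materialized on the first action.
        self.cache_enabled = True
        return self

    def _compute_partition(self, part):
        # Replay the lineage on one partition -- this is also how a lost
        # partition could be recomputed, which is the fault-tolerance idea.
        for kind, fn in self.lineage:
            if kind == "map":
                part = [fn(x) for x in part]
            else:
                part = [x for x in part if fn(x)]
        return part

    def collect(self):
        # Action: triggers execution, conceptually one task per partition.
        if self._materialized is not None:
            return self._materialized
        result = [x for p in self.partitions for x in self._compute_partition(p)]
        if self.cache_enabled:
            self._materialized = result
        return result


rdd = MiniRDD([[1, 2], [3, 4], [5, 6]])                  # 3 simulated partitions
doubled = rdd.map(lambda x: x * 2).filter(lambda x: x > 4).cache()
print(doubled.collect())  # [6, 8, 10, 12]
```

In real PySpark the equivalent chain would be built on a `SparkContext` (e.g. `sc.parallelize(...)`), and the scheduler, shuffle, and storage layers the talk covers do the heavy lifting that this toy version glosses over.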