June 2023 ~ Shanmukh Sattiraju


Friday, June 23, 2023

Azure Synapse Analytics - Can 2 different Spark notebooks connect to the same Spark pool and execute in parallel?

By Shanmukh Sattiraju

Azure Synapse Analytics - Spark

FAQ 1: Can 2 different Spark notebooks connect to the same Spark pool and execute in parallel?

Answer: Yes

My workspace has a capacity of 80 vCores.

As an example, suppose you have created a Spark pool with node size Small (4 vCores, 32 GB memory) and 8 nodes.

Total pool size = 4 vCores x 8 nodes = 32 vCores

You can set the number of nodes to be used by each notebook.

Notebook 1: Total 3 nodes (1 driver node and 2 executor nodes)

4 vCores x 3 nodes = 12 vCores used

Notebook 2: Total 4 nodes (1 driver node and 3 executor nodes)

4 vCores x 4 nodes = 16 vCores used

Combined: 12 vCores + 16 vCores = 28 vCores

Of the pool's 32 vCores, 28 vCores are in use, which is 87.5% utilization.

So you can run 2 notebooks in parallel on a single Spark pool, as long as their combined node requests fit within the pool size.
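The arithmetic for this example can be checked with a few lines of plain Python. This is only a sketch of the calculation, not a Synapse API: the `NotebookSession` class and the `pool_usage` function are hypothetical names made up for illustration, and the numbers are the ones from the example (Small nodes with 4 vCores, a pool of 8 nodes).

```python
from dataclasses import dataclass

VCORES_PER_NODE = 4  # Small node size, as in the example
POOL_NODES = 8       # number of nodes in the example pool


@dataclass
class NotebookSession:
    """Hypothetical helper: the nodes one notebook session asks for."""
    name: str
    executors: int
    drivers: int = 1  # each session in the example has one driver node

    @property
    def nodes(self) -> int:
        return self.drivers + self.executors


def pool_usage(sessions, pool_nodes, vcores_per_node):
    """Return (used vCores, pool capacity in vCores, utilization in percent)."""
    capacity = pool_nodes * vcores_per_node
    used = sum(s.nodes * vcores_per_node for s in sessions)
    return used, capacity, 100.0 * used / capacity


sessions = [
    NotebookSession("Notebook 1", executors=2),  # 3 nodes in total
    NotebookSession("Notebook 2", executors=3),  # 4 nodes in total
]

used, capacity, pct = pool_usage(sessions, POOL_NODES, VCORES_PER_NODE)
print(f"{used} of {capacity} vCores in use ({pct:.1f}%)")
# prints: 28 of 32 vCores in use (87.5%)
```

Changing the executor counts shows quickly whether a third notebook would still fit in the remaining 4 vCores.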

FAQ 2: Can these notebooks share the variables or temporary views created in them, since they are attached to the same pool?

Answer: No

Explanation:

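Apache Spark for Synapse is designed as a job service and not a cluster model. It creates a separate Apache Spark application to run each notebook, so each notebook gets its own Spark session. Variables and temporary views belong to the session that created them and are not visible from the other notebook.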

Azure Data Factory + Azure Synapse Analytics - END to END development Project course - Grab 50% OFF COUPON and ENROLL NOW!

By Shanmukh Sattiraju

Coupon Code link:

https://www.udemy.com/course/azure-data-factory-synapse-analytics-end-to-end-etl-project/?couponCode=NEWJUNE60

Join 450+ students: use the link above to access the course at 50% OFF!

Throughout this course, you'll gain practical hands-on experience with Azure Data Factory and Azure Synapse Analytics, learning how to use these data engineering tools to build an ETL solution. You'll explore the features and capabilities of these platforms, as well as their integration with other Azure services such as:

1. Azure SQL Database

2. Azure Synapse Analytics

3. Azure Key Vault

4. Azure Data Factory for orchestration

5. Azure Storage solutions (Azure Data Lake Storage Gen2)

6. Microsoft Power BI

7. Azure Logic Apps

============================

LinkedIn: https://www.linkedin.com/in/shanmukh-sattiraju/

Watch the video below for the project architecture:

Azure Synapse Analytics - Reading files from Azure Datalake and Writing to ADLS using PySpark

By Shanmukh Sattiraju

Accessing a storage account from Azure Synapse Analytics

The storage account can be accessed directly through a linked service.

With a linked service, we can authenticate using either "Account key" or "User assigned Managed Identity".

Microsoft reference documentation link:

https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-secure-credentials-with-tokenlibrary?pivots=programming-language-python

Shanmukh Sattiraju

Evolve ourselves along with the trending technology by learning and enhance the skill set to master it.


Friday, June 23, 2023

Azure Synapse Analytics - Can 2 different Spark notebooks connect to the same Spark pool and execute in parallel?

Friday, June 09, 2023

Azure Data Factory + Azure Synapse Analytics - END to END development Project course - Grab 50% OFF COUPON and ENROLL NOW!

Azure Synapse Analytics - Reading files from Azure Datalake and Writing to ADLS using PySpark
