Friday, June 23, 2023

Azure Synapse Analytics - Can 2 different Spark notebooks connect to the same Spark pool and execute in parallel?

 

Azure Synapse Analytics – Spark

 

FAQ 1: Can 2 different Spark notebooks connect to the same Spark pool and execute in parallel?

 

Answer: Yes

My workspace has a capacity of 80 vCores.

As an example, suppose you have created a Spark pool with node size Small (4 vCores and 32 GB of memory per node) and 8 nodes.

Total pool size = 4 vCores x 8 nodes = 32 vCores


You can set the number of nodes to be used for each notebook:

Notebook 1: Total 3 nodes (1 driver node and 2 executor nodes)

4 vCores x 3 nodes = 12 vCores used



 

Notebook 2: Total 4 nodes (1 driver node and 3 executor nodes)

4 vCores x 4 nodes = 16 vCores used

 

 




12 vCores + 16 vCores = 28 vCores

 

Of the pool's 32 vCores, 28 vCores are in use, which is 87.5% utilization.

So you can run both notebooks in parallel on a single Spark pool, with 4 vCores (one Small node) still free.
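
The vCore arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation, not a Synapse API: the names `NotebookSession` and `pool_usage` are hypothetical, and the node size and node counts are the ones from this example.

```python
from dataclasses import dataclass

VCORES_PER_NODE = 4  # node size Small, as in the example above
POOL_NODES = 8       # number of nodes in the Spark pool


@dataclass
class NotebookSession:
    """Hypothetical helper describing the nodes one notebook session requests."""
    name: str
    executors: int
    drivers: int = 1  # the example uses one driver node per notebook

    @property
    def nodes(self) -> int:
        return self.drivers + self.executors

    @property
    def vcores(self) -> int:
        return self.nodes * VCORES_PER_NODE


def pool_usage(sessions, pool_nodes=POOL_NODES):
    """Return (used vCores, total pool vCores, utilization in percent)."""
    pool_vcores = pool_nodes * VCORES_PER_NODE
    used = sum(s.vcores for s in sessions)
    return used, pool_vcores, 100 * used / pool_vcores


sessions = [
    NotebookSession("Notebook 1", executors=2),  # 3 nodes in total
    NotebookSession("Notebook 2", executors=3),  # 4 nodes in total
]
used, total, pct = pool_usage(sessions)
print(f"{used} of {total} vCores in use ({pct:.1f}%)")
```

Adding a third entry to `sessions` is a quick way to check whether another notebook would still fit in the pool.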

 

FAQ 2: Can these notebooks share variables or temporary views created in them, since they are attached to the same pool?

Answer: No

 

Explanation:

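Apache Spark for Synapse is designed as a job service and not a cluster model. It creates a separate Apache Spark application to run each notebook. Variables and temporary views exist only inside the Spark session that created them, so a notebook running in one application cannot see those created by the other.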

 



 

 
