Azure Synapse Analytics - Can two different Spark notebooks connect to the same Spark pool and execute in parallel? ~ Shanmukh Sattiraju

Friday, June 23, 2023

Azure Synapse Analytics – Spark

FAQ 1: Can two different Spark notebooks connect to the same Spark pool and execute in parallel?

Answer: Yes

My workspace has a capacity of 80 vCores.

As an example, suppose you have created a Spark pool with node size Small (4 vCores, 32 GB memory) and 8 nodes.

Total pool size = 4 vCores x 8 nodes = 32 vCores

You can set the number of nodes that each notebook uses:

Notebook 1: Total 3 nodes (1 driver node and 2 executor nodes)

4 vCores x 3 nodes = 12 vCores used

Notebook 2: Total 4 nodes (1 driver node and 3 executor nodes)

4 vCores x 4 nodes = 16 vCores used

Combined: 12 vCores + 16 vCores = 28 vCores

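Of the pool's 32 vCores, 28 vCores are in use, which is 87.5% utilization (28 / 32). The remaining 4 vCores are one Small node left free.

So you can run two notebooks in parallel on a single Spark pool, as long as their combined node requests fit within the pool size.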
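The vCore arithmetic for this example can be checked with a few lines of Python. This is only a sketch of the calculation, not a Synapse API: the names `NotebookSession`, `pool_usage`, `VCORES_PER_NODE` and `POOL_NODES` are hypothetical and exist just for this illustration.

```python
from dataclasses import dataclass

# Values taken from the example: Small nodes (4 vCores each), 8 nodes in the pool.
VCORES_PER_NODE = 4
POOL_NODES = 8


@dataclass
class NotebookSession:
    """Hypothetical helper describing the nodes one notebook requests."""
    name: str
    executors: int
    drivers: int = 1  # each notebook in the example uses one driver node

    @property
    def nodes(self) -> int:
        return self.drivers + self.executors

    @property
    def vcores(self) -> int:
        return self.nodes * VCORES_PER_NODE


def pool_usage(sessions, pool_nodes=POOL_NODES):
    """Return (used vCores, total pool vCores, utilization in percent)."""
    pool_vcores = pool_nodes * VCORES_PER_NODE
    used = sum(s.vcores for s in sessions)
    return used, pool_vcores, 100 * used / pool_vcores


sessions = [
    NotebookSession("Notebook 1", executors=2),  # 3 nodes -> 12 vCores
    NotebookSession("Notebook 2", executors=3),  # 4 nodes -> 16 vCores
]

used, total, pct = pool_usage(sessions)
print(f"{used} of {total} vCores used ({pct:.1f}%)")  # 28 of 32 vCores used (87.5%)
print("Fits in pool:", used <= total)                 # Fits in pool: True
```

Changing the executor counts shows quickly whether a third notebook would still fit: with 4 vCores left, there is room for only one more Small node.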

FAQ 2: Can these notebooks share variables or temporary views with each other, since they are attached to the same pool?

Answer: No

Explanation:

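Apache Spark for Synapse is designed as a job service, not a cluster model. It creates a separate Apache Spark application to run each notebook, so each notebook gets its own Spark session. Variables and temporary views are scoped to that session and are not visible from the other notebook, even though both run on the same pool.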