
Commit 9aafed4

Merge pull request #40 from tomarv2/test-cluster-name
Test cluster name
2 parents dbb775e + c7ce7a2

File tree

5 files changed: +87 -47 lines changed
README.md

Lines changed: 32 additions & 10 deletions
@@ -70,6 +70,10 @@ cluster_access_control = [
   {
     group_name       = "<group_name>"
     permission_level = "CAN_RESTART"
+  },
+  {
+    user_name        = "<user_name>"
+    permission_level = "CAN_RESTART"
   }
 ]
 ```
@@ -110,6 +114,10 @@ policy_access_control = [
   {
     group_name       = "<group_name>"
     permission_level = "CAN_USE"
+  },
+  {
+    user_name        = "<user_name>"
+    permission_level = "CAN_USE"
   }
 ]
 ```
@@ -144,7 +152,11 @@ instance_pool_access_control = [
   {
     group_name       = "<group_name>"
     permission_level = "CAN_ATTACH_TO"
-  }
+  },
+  {
+    user_name        = "<user_name>"
+    permission_level = "CAN_ATTACH_TO"
+  },
 ]
 ```

@@ -183,6 +195,10 @@ jobs_access_control = [
   {
     group_name       = "<group_name>"
     permission_level = "CAN_MANAGE_RUN"
+  },
+  {
+    user_name        = "<user_name>"
+    permission_level = "CAN_MANAGE_RUN"
   }
 ]
 ```
@@ -214,6 +230,10 @@ notebooks_access_control = [
   {
     group_name       = "<group_name>"
     permission_level = "CAN_MANAGE"
+  },
+  {
+    user_name        = "<user_name>"
+    permission_level = "CAN_MANAGE"
   }
 ]
 ```
@@ -238,7 +258,7 @@ terraform destroy -var='teamid=tryme' -var='prjid=project'
 
 #### Recommended method (store remote state in S3 using `prjid` and `teamid` to create directory structure):
 
-- Create python 3.6+ virtual environment
+- Create python 3.8+ virtual environment
 ```
 python3 -m venv <venv name>
 ```
@@ -321,7 +341,6 @@ Error: Failed to delete token in Scope <scope name>
 ```
 Error: Scope <scope name> does not exist!
 ```
-
 ## Requirements
 
 | Name | Version |
@@ -360,8 +379,10 @@ No modules.
 | [databricks_notebook.notebook_file_deployment](https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/notebook) | resource |
 | [databricks_permissions.cluster](https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/permissions) | resource |
 | [databricks_permissions.driver_pool](https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/permissions) | resource |
+| [databricks_permissions.existing_cluster_new_job_existing_notebooks](https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/permissions) | resource |
 | [databricks_permissions.existing_cluster_new_job_new_notebooks](https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/permissions) | resource |
 | [databricks_permissions.jobs_notebook](https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/permissions) | resource |
+| [databricks_permissions.new_cluster_new_job_existing_notebooks](https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/permissions) | resource |
 | [databricks_permissions.new_cluster_new_job_new_notebooks](https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/permissions) | resource |
 | [databricks_permissions.notebook](https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/permissions) | resource |
 | [databricks_permissions.policy](https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/permissions) | resource |
@@ -376,7 +397,7 @@ No modules.
 
 | Name | Description | Type | Default | Required |
 |------|-------------|------|---------|:--------:|
-| <a name="input_add_instance_profile_to_workspace"></a> [add\_instance\_profile\_to\_workspace](#input\_add\_instance\_profile\_to\_workspace) | Existing AWS instance profile ARN | `any` | `false` | no |
+| <a name="input_add_instance_profile_to_workspace"></a> [add\_instance\_profile\_to\_workspace](#input\_add\_instance\_profile\_to\_workspace) | Existing AWS instance profile ARN | `bool` | `false` | no |
 | <a name="input_allow_cluster_create"></a> [allow\_cluster\_create](#input\_allow\_cluster\_create) | This is a field to allow the group to have cluster create privileges. More fine grained permissions could be assigned with databricks\_permissions and cluster\_id argument. Everyone without allow\_cluster\_create argument set, but with permission to use Cluster Policy would be able to create clusters, but within boundaries of that specific policy. | `bool` | `true` | no |
 | <a name="input_allow_instance_pool_create"></a> [allow\_instance\_pool\_create](#input\_allow\_instance\_pool\_create) | This is a field to allow the group to have instance pool create privileges. More fine grained permissions could be assigned with databricks\_permissions and instance\_pool\_id argument. | `bool` | `true` | no |
 | <a name="input_allow_sql_analytics_access"></a> [allow\_sql\_analytics\_access](#input\_allow\_sql\_analytics\_access) | This is a field to allow the group to have access to SQL Analytics feature through databricks\_sql\_endpoint. | `bool` | `true` | no |
@@ -386,6 +407,7 @@ No modules.
 | <a name="input_cluster_access_control"></a> [cluster\_access\_control](#input\_cluster\_access\_control) | Cluster access control | `any` | `null` | no |
 | <a name="input_cluster_autotermination_minutes"></a> [cluster\_autotermination\_minutes](#input\_cluster\_autotermination\_minutes) | cluster auto termination duration | `number` | `30` | no |
 | <a name="input_cluster_id"></a> [cluster\_id](#input\_cluster\_id) | Existing cluster id | `string` | `null` | no |
+| <a name="input_cluster_name"></a> [cluster\_name](#input\_cluster\_name) | Cluster name | `string` | `null` | no |
 | <a name="input_cluster_policy_id"></a> [cluster\_policy\_id](#input\_cluster\_policy\_id) | Existing cluster policy id | `string` | `null` | no |
 | <a name="input_create_group"></a> [create\_group](#input\_create\_group) | Create a new group, if group already exists the deployment will fail. | `bool` | `false` | no |
 | <a name="input_create_user"></a> [create\_user](#input\_create\_user) | Create a new user, if user already exists the deployment will fail. | `bool` | `false` | no |
@@ -400,6 +422,7 @@ No modules.
 | <a name="input_deploy_worker_instance_pool"></a> [deploy\_worker\_instance\_pool](#input\_deploy\_worker\_instance\_pool) | Worker instance pool | `bool` | `false` | no |
 | <a name="input_driver_node_type_id"></a> [driver\_node\_type\_id](#input\_driver\_node\_type\_id) | The node type of the Spark driver. This field is optional; if unset, API will set the driver node type to the same value as node\_type\_id. | `string` | `null` | no |
 | <a name="input_email_notifications"></a> [email\_notifications](#input\_email\_notifications) | Email notification block. | `any` | `null` | no |
+| <a name="input_existing_cluster"></a> [existing\_cluster](#input\_existing\_cluster) | Existing job cluster | `bool` | `false` | no |
 | <a name="input_fixed_value"></a> [fixed\_value](#input\_fixed\_value) | Number of nodes in the cluster. | `number` | `0` | no |
 | <a name="input_gb_per_core"></a> [gb\_per\_core](#input\_gb\_per\_core) | Number of gigabytes per core available on instance. Conflicts with min\_memory\_gb. Defaults to 0. | `string` | `0` | no |
 | <a name="input_gpu"></a> [gpu](#input\_gpu) | GPU required or not. | `bool` | `false` | no |
@@ -408,13 +431,12 @@ No modules.
 | <a name="input_group_can_restart"></a> [group\_can\_restart](#input\_group\_can\_restart) | Group allowed to access the platform. | `string` | `""` | no |
 | <a name="input_idle_instance_autotermination_minutes"></a> [idle\_instance\_autotermination\_minutes](#input\_idle\_instance\_autotermination\_minutes) | idle instance auto termination duration | `number` | `20` | no |
 | <a name="input_instance_pool_access_control"></a> [instance\_pool\_access\_control](#input\_instance\_pool\_access\_control) | Instance pool access control | `any` | `null` | no |
-| <a name="input_instance_profile_arn"></a> [instance\_profile\_arn](#input\_instance\_profile\_arn) | ARN attribute of aws\_iam\_instance\_profile output, the EC2 instance profile association to AWS IAM role. This ARN would be validated upon resource creation and it's not possible to skip validation. | `any` | `null` | no |
 | <a name="input_is_meta_instance_profile"></a> [is\_meta\_instance\_profile](#input\_is\_meta\_instance\_profile) | Whether the instance profile is a meta instance profile. Used only in IAM credential passthrough. | `any` | `false` | no |
 | <a name="input_jobs_access_control"></a> [jobs\_access\_control](#input\_jobs\_access\_control) | Jobs access control | `any` | `null` | no |
 | <a name="input_language"></a> [language](#input\_language) | notebook language | `string` | `"PYTHON"` | no |
 | <a name="input_local_disk"></a> [local\_disk](#input\_local\_disk) | Pick only nodes with local storage. Defaults to false. | `string` | `true` | no |
-| <a name="input_local_notebooks"></a> [local\_notebooks](#input\_local\_notebooks) | nested block: NestingSet, min items: 0, max items: 0 | `any` | `[]` | no |
-| <a name="input_local_path"></a> [local\_path](#input\_local\_path) | notebook location on user machine | `string` | `null` | no |
+| <a name="input_local_notebooks"></a> [local\_notebooks](#input\_local\_notebooks) | Local path to the notebook(s) that will be used by the job | `any` | `[]` | no |
+| <a name="input_local_path"></a> [local\_path](#input\_local\_path) | Notebook(s) location on the user's machine | `string` | `null` | no |
 | <a name="input_max_capacity"></a> [max\_capacity](#input\_max\_capacity) | instance pool maximum capacity | `number` | `3` | no |
 | <a name="input_max_concurrent_runs"></a> [max\_concurrent\_runs](#input\_max\_concurrent\_runs) | An optional maximum allowed number of concurrent runs of the job. | `number` | `null` | no |
 | <a name="input_max_retries"></a> [max\_retries](#input\_max\_retries) | An optional maximum number of times to retry an unsuccessful run. A run is considered to be unsuccessful if it completes with a FAILED result\_state or INTERNAL\_ERROR life\_cycle\_state. The value -1 means to retry indefinitely and the value 0 means to never retry. The default behavior is to never retry. | `number` | `0` | no |
@@ -424,13 +446,13 @@ No modules.
 | <a name="input_min_memory_gb"></a> [min\_memory\_gb](#input\_min\_memory\_gb) | Minimum amount of memory per node in gigabytes. Defaults to 0. | `string` | `0` | no |
 | <a name="input_min_retry_interval_millis"></a> [min\_retry\_interval\_millis](#input\_min\_retry\_interval\_millis) | An optional minimal interval in milliseconds between the start of the failed run and the subsequent retry run. The default behavior is that unsuccessful runs are immediately retried. | `number` | `null` | no |
 | <a name="input_ml"></a> [ml](#input\_ml) | ML required or not. | `bool` | `false` | no |
-| <a name="input_notebooks_access_control"></a> [notebook\_access\_control](#input\_notebook\_access\_control) | Notebook access control | `any` | `null` | no |
-| <a name="input_notebooks"></a> [notebooks](#input\_notebooks) | nested block: NestingSet, min items: 0, max items: 0 | `any` | `[]` | no |
+| <a name="input_notebooks"></a> [notebooks](#input\_notebooks) | Local path to the notebook(s) that will be deployed | `any` | `[]` | no |
+| <a name="input_notebooks_access_control"></a> [notebooks\_access\_control](#input\_notebooks\_access\_control) | Notebook access control | `any` | `null` | no |
 | <a name="input_num_workers"></a> [num\_workers](#input\_num\_workers) | number of workers for job | `number` | `1` | no |
 | <a name="input_policy_access_control"></a> [policy\_access\_control](#input\_policy\_access\_control) | Policy access control | `any` | `null` | no |
 | <a name="input_policy_overrides"></a> [policy\_overrides](#input\_policy\_overrides) | Cluster policy overrides | `any` | `null` | no |
 | <a name="input_prjid"></a> [prjid](#input\_prjid) | (Required) Name of the project/stack e.g: mystack, nifieks, demoaci. Should not be changed after running 'tf apply' | `string` | n/a | yes |
-| <a name="input_remote_notebooks"></a> [remote\_notebooks](#input\_remote\_notebooks) | nested block: NestingSet, min items: 0, max items: 0 | `any` | `[]` | no |
+| <a name="input_remote_notebooks"></a> [remote\_notebooks](#input\_remote\_notebooks) | Path to notebook(s) in the databricks workspace that will be used by the job | `any` | `[]` | no |
 | <a name="input_retry_on_timeout"></a> [retry\_on\_timeout](#input\_retry\_on\_timeout) | An optional policy to specify whether to retry a job when it times out. The default behavior is to not retry on timeout. | `bool` | `false` | no |
 | <a name="input_schedule"></a> [schedule](#input\_schedule) | Job schedule configuration. | `map(any)` | `null` | no |
 | <a name="input_spark_conf"></a> [spark\_conf](#input\_spark\_conf) | Optional Spark configuration block | `any` | `null` | no |
cluster.tf

Lines changed: 5 additions & 3 deletions
@@ -11,7 +11,7 @@ resource "databricks_instance_profile" "shared" {
 resource "databricks_cluster" "cluster" {
   count = (var.deploy_cluster == true && (var.fixed_value != 0 || var.auto_scaling != null) ? 1 : 0)
 
-  cluster_name = "${var.teamid}-${var.prjid} (Terraform managed)"
+  cluster_name = var.cluster_name != null ? var.cluster_name : "${var.teamid}-${var.prjid} (Terraform managed)"
 
   policy_id     = var.cluster_policy_id == null && var.deploy_cluster_policy == false ? null : local.cluster_policy_id
   spark_version = var.spark_version != null ? var.spark_version : data.databricks_spark_version.latest.id
@@ -47,8 +47,10 @@ resource "databricks_cluster" "cluster" {
 }
 
 resource "databricks_cluster" "single_node_cluster" {
-  count        = var.deploy_cluster == true && var.fixed_value == 0 && var.auto_scaling == null ? 1 : 0
-  cluster_name = "${var.teamid}-${var.prjid} (Terraform managed)"
+  count = var.deploy_cluster == true && var.fixed_value == 0 && var.auto_scaling == null ? 1 : 0
+
+  cluster_name = var.cluster_name != null ? var.cluster_name : "${var.teamid}-${var.prjid} (Terraform managed)"
+
   spark_version = var.spark_version != null ? var.spark_version : data.databricks_spark_version.latest.id
   node_type_id  = var.deploy_worker_instance_pool != true ? local.driver_node_type : null
   num_workers   = 0
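Both cluster resources now share the same naming fallback, so the effective name is (illustrative values, assuming `teamid = "demo"` and `prjid = "myproj"`):

```
# var.cluster_name = null        -> "demo-myproj (Terraform managed)"
# var.cluster_name = "analytics" -> "analytics"
cluster_name = var.cluster_name != null ? var.cluster_name : "${var.teamid}-${var.prjid} (Terraform managed)"
```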

job.tf

Lines changed: 18 additions & 24 deletions
@@ -3,15 +3,12 @@
 # 1. NEW CLUSTER WITH NEW NOTEBOOKS
 # ------------------------------------------------
 resource "databricks_job" "new_cluster_new_job_new_notebooks" {
-  for_each = (var.deploy_jobs == true && var.cluster_id == null && var.local_notebooks != null) ? { for p in var.local_notebooks : "${p.job_name}-${p.local_path}" => p } : {}
+  for_each = (var.deploy_jobs == true && var.existing_cluster == false && var.local_notebooks != null) ? { for p in var.local_notebooks : "${p.job_name}-${p.local_path}" => p } : {}
+  #for_each = (var.deploy_jobs == true && var.cluster_id == null && var.local_notebooks != null) ? { for p in var.local_notebooks : "${p.job_name}-${p.local_path}" => p } : {}
 
   name = "${each.value.job_name} (Terraform managed)"
 
-  new_cluster {
-    num_workers   = var.num_workers
-    spark_version = data.databricks_spark_version.latest.id
-    node_type_id  = join("", data.databricks_node_type.cluster_node_type.*.id)
-  }
+  existing_cluster_id = local.cluster_info
 
   notebook_task {
     notebook_path = lookup(each.value, "path", "${data.databricks_current_user.me.home}/${each.value.job_name}")
@@ -43,18 +40,20 @@ resource "databricks_job" "new_cluster_new_job_new_notebooks" {
     }
   }
 }
-
 # ------------------------------------------------
-# 2. EXISTING CLUSTER WITH NEW NOTEBOOKS
+# 2. NEW CLUSTER WITH EXISTING NOTEBOOKS
 # ------------------------------------------------
-resource "databricks_job" "existing_cluster_new_job_new_notebooks" {
-  for_each = (var.deploy_jobs == true && var.cluster_id != null && var.local_notebooks != null) ? { for p in var.local_notebooks : "${p.job_name}-${p.local_path}" => p } : {}
+resource "databricks_job" "new_cluster_new_job_existing_notebooks" {
+  for_each = (var.deploy_jobs == true && var.existing_cluster == false && var.remote_notebooks != null) ? { for p in var.remote_notebooks : "${p.job_name}-${p.path}" => p } : {}
+  #for_each = (var.deploy_jobs == true && var.cluster_id == null && var.remote_notebooks != null) ? { for p in var.remote_notebooks : "${p.job_name}-${p.path}" => p } : {}
+
+  name = "${each.value.job_name} (Terraform managed)"
 
-  name                = "${each.value.job_name} (Terraform managed)"
   existing_cluster_id = local.cluster_info
 
   notebook_task {
-    notebook_path   = lookup(each.value, "path", "${data.databricks_current_user.me.home}/${each.value.job_name}")
+    notebook_path   = lookup(each.value, "path")
     base_parameters = var.task_parameters
   }
 
@@ -83,22 +82,18 @@ resource "databricks_job" "existing_cluster_new_job_new_notebooks" {
     }
   }
 }
+
 # ------------------------------------------------
-# 3. NEW CLUSTER WITH EXISTING NOTEBOOKS
+# 3. EXISTING CLUSTER WITH NEW NOTEBOOKS
 # ------------------------------------------------
-resource "databricks_job" "new_cluster_new_job_existing_notebooks" {
-  for_each = (var.deploy_jobs == true && var.cluster_id == null && var.remote_notebooks != null) ? { for p in var.remote_notebooks : "${p.job_name}-${p.path}" => p } : {}
-
-  name = "${each.value.job_name} (Terraform managed)"
+resource "databricks_job" "existing_cluster_new_job_new_notebooks" {
+  for_each = (var.deploy_jobs == true && var.cluster_id != null && var.local_notebooks != null) ? { for p in var.local_notebooks : "${p.job_name}-${p.local_path}" => p } : {}
 
-  new_cluster {
-    num_workers   = var.num_workers
-    spark_version = data.databricks_spark_version.latest.id
-    node_type_id  = join("", data.databricks_node_type.cluster_node_type.*.id)
-  }
+  name                = "${each.value.job_name} (Terraform managed)"
+  existing_cluster_id = local.cluster_info
 
   notebook_task {
-    notebook_path   = lookup(each.value, "path")
+    notebook_path   = lookup(each.value, "path", "${data.databricks_current_user.me.home}/${each.value.job_name}")
     base_parameters = var.task_parameters
   }
 
@@ -127,7 +122,6 @@ resource "databricks_job" "new_cluster_new_job_existing_notebooks" {
     }
   }
 }
-
 # ------------------------------------------------
 # 4. EXISTING CLUSTER WITH EXISTING NOTEBOOKS
 # ------------------------------------------------
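The four job variants key their `for_each` maps on `job_name` plus the notebook path, so each entry in `local_notebooks` or `remote_notebooks` yields one job, and the `existing_cluster` flag (variants 1–2) or `cluster_id` (variants 3–4) selects which variant fires. A sketch of the expected input shapes; all values here are hypothetical:

```
existing_cluster = false # use the module-managed cluster rather than an external cluster_id

local_notebooks = [
  {
    job_name   = "etl-daily"              # job becomes "etl-daily (Terraform managed)"
    local_path = "notebooks/etl_daily.py" # notebook uploaded from the local machine
  },
]

remote_notebooks = [
  {
    job_name = "scoring"
    path     = "/Shared/scoring" # notebook already in the Databricks workspace
  },
]
```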
