Commit f84b4fd

Updated Dotnet and Arcade (#1197)
* Updated .NET 6 to .NET 8, .NET Framework 4.6 to 4.8, and dependencies.
* Fixed a build failure and the logic for building the path to a test folder; added a few git-ignores.
* Copy-pasted the Arcade 8 common folder.
* Updated Arcade: modified versions and merged Arcade's default configuration with Dotnet.Spark.
* Removed an unneeded VS Code configuration file and added it to .gitignore.
1 parent 7b9224a commit f84b4fd

File tree

196 files changed, +7949 -4364 lines changed


.gitignore

+9-1
```diff
@@ -27,6 +27,8 @@ bld/
 
 # Visual Studio 2015/2017 cache/options directory
 .vs/
+# Visual Studio Code configuration directory
+.vscode/
 # Uncomment if you have tasks that create the project's static files in wwwroot
 #wwwroot/
 
@@ -185,6 +187,7 @@ PublishScripts/
 *.snupkg
 # The packages folder can be ignored because of Package Restore
 **/[Pp]ackages/*
+.packages/*
 # except build/, which is used as an MSBuild target.
 !**/[Pp]ackages/build/
 # Uncomment if necessary however generally it will be regenerated when needed
@@ -372,4 +375,9 @@ hs_err_pid*
 .ionide/
 
 # Mac dev
-.DS_Store
+.DS_Store
+
+# Scala intermideate build files
+**/.bloop/
+**/.metals/
+**/.bsp/**
```
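The new ignore rules can be exercised directly with `git check-ignore`. A quick sanity check in a throwaway repository (the file paths below are illustrative, not from the commit):

```shell
# Create a scratch repo and try the patterns added by this commit.
tmp=$(mktemp -d) && cd "$tmp" && git init -q
# Patterns copied from the .gitignore diff above.
printf '%s\n' '.vscode/' '.packages/*' '**/.bloop/' '**/.metals/' '**/.bsp/**' > .gitignore
mkdir -p .vscode sub/.bloop
touch .vscode/settings.json sub/.bloop/state.json
# check-ignore exits 0 when a path is ignored.
git check-ignore -q .vscode/settings.json && echo ".vscode ignored"
git check-ignore -q sub/.bloop/state.json && echo ".bloop ignored"
```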

LICENSE

+4-2
```diff
@@ -1,6 +1,8 @@
-MIT License
+The MIT License (MIT)
 
-Copyright (c) 2019 .NET Foundation
+Copyright (c) .NET Foundation and Contributors
+
+All rights reserved.
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
```

NuGet.config

+1-1
```diff
@@ -5,6 +5,6 @@
     <add key="dotnet-public" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/index.json" />
     <add key="dotnet-tools" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-tools/nuget/v3/index.json" />
     <add key="dotnet-eng" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-eng/nuget/v3/index.json" />
-    <add key="dotnet5" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet5/nuget/v3/index.json" />
+    <add key="dotnet8" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet8/nuget/v3/index.json" />
   </packageSources>
 </configuration>
```

README.md

+8-5
```diff
@@ -8,7 +8,7 @@
 
 .NET for Apache Spark is compliant with .NET Standard - a formal specification of .NET APIs that are common across .NET implementations. This means you can use .NET for Apache Spark anywhere you write .NET code allowing you to reuse all the knowledge, skills, code, and libraries you already have as a .NET developer.
 
-.NET for Apache Spark runs on Windows, Linux, and macOS using .NET 6, or Windows using .NET Framework. It also runs on all major cloud providers including [Azure HDInsight Spark](deployment/README.md#azure-hdinsight-spark), [Amazon EMR Spark](deployment/README.md#amazon-emr-spark), [AWS](deployment/README.md#databricks) & [Azure](deployment/README.md#databricks) Databricks.
+.NET for Apache Spark runs on Windows, Linux, and macOS using .NET 8, or Windows using .NET Framework. It also runs on all major cloud providers including [Azure HDInsight Spark](deployment/README.md#azure-hdinsight-spark), [Amazon EMR Spark](deployment/README.md#amazon-emr-spark), [AWS](deployment/README.md#databricks) & [Azure](deployment/README.md#databricks) Databricks.
 
 **Note**: We currently have a Spark Project Improvement Proposal JIRA at [SPIP: .NET bindings for Apache Spark](https://issues.apache.org/jira/browse/SPARK-27006) to work with the community towards getting .NET support by default into Apache Spark. We highly encourage you to participate in the discussion.
 
@@ -40,7 +40,7 @@
   <tbody align="center">
     <tr>
       <td>2.4*</td>
-      <td rowspan=4><a href="https://github.com/dotnet/spark/releases/tag/v2.1.1">v2.1.1</a></td>
+      <td rowspan=5><a href="https://github.com/dotnet/spark/releases/tag/v2.1.1">v2.1.1</a></td>
     </tr>
     <tr>
       <td>3.0</td>
@@ -50,6 +50,9 @@
     </tr>
     <tr>
       <td>3.2</td>
+    </tr>
+    <tr>
+      <td>3.5</td>
     </tr>
   </tbody>
 </table>
@@ -61,7 +64,7 @@
 .NET for Apache Spark releases are available [here](https://github.com/dotnet/spark/releases) and NuGet packages are available [here](https://www.nuget.org/packages/Microsoft.Spark).
 
 ## Get Started
-These instructions will show you how to run a .NET for Apache Spark app using .NET 6.
+These instructions will show you how to run a .NET for Apache Spark app using .NET 8.
 - [Windows Instructions](docs/getting-started/windows-instructions.md)
 - [Ubuntu Instructions](docs/getting-started/ubuntu-instructions.md)
 - [MacOs Instructions](docs/getting-started/macos-instructions.md)
@@ -79,8 +82,8 @@ Building from source is very easy and the whole process (from cloning to being a
 
 | | | Instructions |
 | :---: | :--- | :--- |
-| ![Windows icon](docs/img/windows-icon-32.png) | **Windows** | <ul><li>Local - [.NET Framework 4.6.1](docs/building/windows-instructions.md#using-visual-studio-for-net-framework-461)</li><li>Local - [.NET 6](docs/building/windows-instructions.md#using-net-core-cli-for-net-core)</li><ul> |
-| ![Ubuntu icon](docs/img/ubuntu-icon-32.png) | **Ubuntu** | <ul><li>Local - [.NET 6](docs/building/ubuntu-instructions.md)</li><li>[Azure HDInsight Spark - .NET 6](deployment/README.md)</li></ul> |
+| ![Windows icon](docs/img/windows-icon-32.png) | **Windows** | <ul><li>Local - [.NET Framework 4.8](docs/building/windows-instructions.md#using-visual-studio-for-net-framework)</li><li>Local - [.NET 8](docs/building/windows-instructions.md#using-net-core-cli-for-net-core)</li><ul> |
+| ![Ubuntu icon](docs/img/ubuntu-icon-32.png) | **Ubuntu** | <ul><li>Local - [.NET 8](docs/building/ubuntu-instructions.md)</li><li>[Azure HDInsight Spark - .NET 8](deployment/README.md)</li></ul> |
 
 <a name="samples"></a>
 ## Samples
```

azure-pipelines-e2e-tests-template.yml

+3-3
```diff
@@ -65,10 +65,10 @@ stages:
           mvn -version
 
       - task: UseDotNet@2
-        displayName: 'Use .NET 6 sdk'
+        displayName: 'Use .NET 8 sdk'
         inputs:
           packageType: sdk
-          version: 6.x
+          version: 8.x
           installationPath: $(Agent.ToolsDirectory)/dotnet
 
       - task: DownloadPipelineArtifact@2
@@ -78,7 +78,7 @@ stages:
           artifactName: Microsoft.Spark.Binaries
 
       - pwsh: |
-          $framework = "net6.0"
+          $framework = "net8.0"
 
           if ($env:AGENT_OS -eq 'Windows_NT') {
             $runtimeIdentifier = "win-x64"
```
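The `pwsh` step above picks a target framework and then a runtime identifier per agent OS. A rough bash sketch of that selection logic (only `win-x64` appears in the diff; the other runtime identifiers here are assumptions):

```shell
# Mirror the pipeline's framework/RID selection in bash (sketch only).
framework="net8.0"
case "${AGENT_OS:-Linux}" in
  Windows_NT) runtimeIdentifier="win-x64" ;;   # shown in the diff
  Darwin)     runtimeIdentifier="osx-x64" ;;   # assumed
  *)          runtimeIdentifier="linux-x64" ;; # assumed
esac
echo "$framework/$runtimeIdentifier"
```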

azure-pipelines-pr.yml

-1
```diff
@@ -151,7 +151,6 @@ extends:
         - script: build.cmd -pack
                   -c $(buildConfiguration)
                   -ci
-                  $(_OfficialBuildIdArgs)
                   /p:PublishSparkWorker=true
                   /p:SparkWorkerPublishDir=$(Build.ArtifactStagingDirectory)\Microsoft.Spark.Worker
           displayName: '.NET build'
```

azure-pipelines.yml

+2
```diff
@@ -133,6 +133,8 @@ stages:
     variables:
       ${{ if and(ne(variables['System.TeamProject'], 'public'), notin(variables['Build.Reason'], 'PullRequest')) }}:
         _OfficialBuildIdArgs: /p:OfficialBuildId=$(BUILD.BUILDNUMBER)
+      ${{ else }}:
+        _OfficialBuildIdArgs: ''
 
     steps:
     - task: DownloadBuildArtifacts@0
```
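Taken together with the `azure-pipelines-pr.yml` change above, the intent appears to be that `$(_OfficialBuildIdArgs)` always expands to something well-defined: official builds get the `OfficialBuildId` property, while public and PR builds now get an explicit empty string rather than an unreplaced macro. Condensed, the resulting variables block looks like this (a sketch, not the full file):

```yaml
variables:
  # Non-public, non-PR builds: stamp the official build id.
  ${{ if and(ne(variables['System.TeamProject'], 'public'), notin(variables['Build.Reason'], 'PullRequest')) }}:
    _OfficialBuildIdArgs: /p:OfficialBuildId=$(BUILD.BUILDNUMBER)
  # Everything else: define the variable as empty so $(_OfficialBuildIdArgs)
  # expands to nothing instead of surviving as a literal on the command line.
  ${{ else }}:
    _OfficialBuildIdArgs: ''
```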

benchmark/README.md

+1-1
````diff
@@ -60,7 +60,7 @@ TPCH timing results is written to stdout in the following form: `TPCH_Result,<la
    <true for sql tests, false for functional tests>
    ```
 
-**Note**: Ensure that you build the worker and application with .NET 6 in order to run hardware acceleration queries.
+**Note**: Ensure that you build the worker and application with .NET 8 in order to run hardware acceleration queries.
 
 ## Python
 1. Upload [run_python_benchmark.sh](run_python_benchmark.sh) and all [python tpch benchmark](python/) files to the cluster.
````

benchmark/csharp/Tpch/Tpch.csproj

+3-3
```diff
@@ -2,8 +2,8 @@
 
   <PropertyGroup>
     <OutputType>Exe</OutputType>
-    <TargetFrameworks>net461;net6.0</TargetFrameworks>
-    <TargetFrameworks Condition="'$(OS)' != 'Windows_NT'">net6.0</TargetFrameworks>
+    <TargetFrameworks>net48;net8.0</TargetFrameworks>
+    <TargetFrameworks Condition="'$(OS)' != 'Windows_NT'">net8.0</TargetFrameworks>
     <RootNamespace>Tpch</RootNamespace>
     <AssemblyName>Tpch</AssemblyName>
   </PropertyGroup>
@@ -16,7 +16,7 @@
   </ItemGroup>
 
   <Choose>
-    <When Condition="'$(TargetFramework)' == 'net6.0'">
+    <When Condition="'$(TargetFramework)' == 'net8.0'">
       <PropertyGroup>
         <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
       </PropertyGroup>
```
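The paired `<TargetFrameworks>` lines rely on MSBuild's last-assignment-wins property evaluation: on non-Windows machines the conditioned second line overwrites the first, so only `net8.0` is built there, while Windows builds both `net48` and `net8.0`. A minimal standalone sketch of the same pattern (project content is illustrative, not from the commit):

```xml
<!-- Minimal multi-targeting sketch of the pattern used in Tpch.csproj. -->
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <!-- Default: build both the .NET Framework and modern .NET targets. -->
    <TargetFrameworks>net48;net8.0</TargetFrameworks>
    <!-- On non-Windows hosts this later assignment wins, dropping net48. -->
    <TargetFrameworks Condition="'$(OS)' != 'Windows_NT'">net8.0</TargetFrameworks>
  </PropertyGroup>
</Project>
```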

deployment/README.md

+3-3
```diff
@@ -63,7 +63,7 @@ Microsoft.Spark.Worker is a backend component that lives on the individual worke
 ## Azure HDInsight Spark
 [Azure HDInsight Spark](https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-overview) is the Microsoft implementation of Apache Spark in the cloud that allows users to launch and configure Spark clusters in Azure. You can use HDInsight Spark clusters to process your data stored in Azure (e.g., [Azure Storage](https://azure.microsoft.com/en-us/services/storage/) and [Azure Data Lake Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction)).
 
-> **Note:** Azure HDInsight Spark is Linux-based. Therefore, if you are interested in deploying your app to Azure HDInsight Spark, make sure your app is .NET Standard compatible and that you use [.NET 6 compiler](https://dotnet.microsoft.com/download) to compile your app.
+> **Note:** Azure HDInsight Spark is Linux-based. Therefore, if you are interested in deploying your app to Azure HDInsight Spark, make sure your app is .NET Standard compatible and that you use [.NET 8 compiler](https://dotnet.microsoft.com/download) to compile your app.
 
 ### Deploy Microsoft.Spark.Worker
 *Note that this step is required only once*
@@ -115,7 +115,7 @@ EOF
 ## Amazon EMR Spark
 [Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-what-is-emr.html) is a managed cluster platform that simplifies running big data frameworks on AWS.
 
-> **Note:** AWS EMR Spark is Linux-based. Therefore, if you are interested in deploying your app to AWS EMR Spark, make sure your app is .NET Standard compatible and that you use [.NET 6 compiler](https://dotnet.microsoft.com/download) to compile your app.
+> **Note:** AWS EMR Spark is Linux-based. Therefore, if you are interested in deploying your app to AWS EMR Spark, make sure your app is .NET Standard compatible and that you use [.NET 8 compiler](https://dotnet.microsoft.com/download) to compile your app.
 
 ### Deploy Microsoft.Spark.Worker
 *Note that this step is only required at cluster creation*
@@ -160,7 +160,7 @@ foo@bar:~$ aws emr add-steps \
 ## Databricks
 [Databricks](http://databricks.com) is a platform that provides cloud-based big data processing using Apache Spark.
 
-> **Note:** [Azure](https://azure.microsoft.com/en-us/services/databricks/) and [AWS](https://databricks.com/aws) Databricks is Linux-based. Therefore, if you are interested in deploying your app to Databricks, make sure your app is .NET Standard compatible and that you use [.NET 6 compiler](https://dotnet.microsoft.com/download) to compile your app.
+> **Note:** [Azure](https://azure.microsoft.com/en-us/services/databricks/) and [AWS](https://databricks.com/aws) Databricks is Linux-based. Therefore, if you are interested in deploying your app to Databricks, make sure your app is .NET Standard compatible and that you use [.NET 8 compiler](https://dotnet.microsoft.com/download) to compile your app.
 
 Databricks allows you to submit Spark .NET apps to an existing active cluster or create a new cluster everytime you launch a job. This requires the **Microsoft.Spark.Worker** to be installed **first** before you submit a Spark .NET app.
```
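After compiling with the .NET 8 SDK as the notes above describe, submission to these Linux-based clusters goes through `spark-submit` with the `org.apache.spark.deploy.dotnet.DotnetRunner` class. The snippet below only composes the command line for illustration; the app names are hypothetical and the `<version>` placeholder is left unfilled:

```shell
# Compose (not run) an illustrative spark-submit command for a .NET 8 app.
SPARK_JAR="microsoft-spark-3-5_2.12-<version>.jar"  # placeholder version
APP_ARCHIVE="MySparkApp.zip"                        # hypothetical app bundle
cmd="spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner --master yarn ${SPARK_JAR} ${APP_ARCHIVE} MySparkApp"
echo "$cmd"
```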
docs/building/ubuntu-instructions.md

+20-17
````diff
@@ -2,11 +2,13 @@ Building Spark .NET on Ubuntu 18.04
 ==========================
 
 # Table of Contents
-- [Open Issues](#open-issues)
-- [Pre-requisites](#pre-requisites)
+- [Building Spark .NET on Ubuntu 18.04](#building-spark-net-on-ubuntu-1804)
+- [Table of Contents](#table-of-contents)
+- [Open Issues:](#open-issues)
+- [Pre-requisites:](#pre-requisites)
 - [Building](#building)
   - [Building Spark .NET Scala Extensions Layer](#building-spark-net-scala-extensions-layer)
-  - [Building .NET Sample Applications using .NET Core CLI](#building-net-sample-applications-using-net-core-cli)
+  - [Building .NET Sample Applications using .NET 8 CLI](#building-net-sample-applications-using-net-8-cli)
 - [Run Samples](#run-samples)
 
 # Open Issues:
@@ -16,7 +18,7 @@ Building Spark .NET on Ubuntu 18.04
 
 If you already have all the pre-requisites, skip to the [build](ubuntu-instructions.md#building) steps below.
 
-1. Download and install **[.NET 6 SDK](https://dotnet.microsoft.com/en-us/download/dotnet/6.0)** - installing the SDK will add the `dotnet` toolchain to your path.
+1. Download and install **[.NET 8 SDK](https://dotnet.microsoft.com/en-us/download/dotnet/8.0)** - installing the SDK will add the `dotnet` toolchain to your path.
 2. Install **[OpenJDK 8](https://openjdk.java.net/install/)**
    - You can use the following command:
    ```bash
@@ -110,65 +112,66 @@ Let us now build the Spark .NET Scala extension layer. This is easy to do:
 
 ```
 cd src/scala
-mvn clean package
+mvn clean package
 ```
 You should see JARs created for the supported Spark versions:
 * `microsoft-spark-2-3/target/microsoft-spark-2-3_2.11-<version>.jar`
 * `microsoft-spark-2-4/target/microsoft-spark-2-4_2.11-<version>.jar`
 * `microsoft-spark-3-0/target/microsoft-spark-3-0_2.12-<version>.jar`
+* `microsoft-spark-3-0/target/microsoft-spark-3-5_2.12-<version>.jar`
 
-## Building .NET Sample Applications using .NET 6 CLI
+## Building .NET Sample Applications using .NET 8 CLI
 
 1. Build the Worker
    ```bash
    cd ~/dotnet.spark/src/csharp/Microsoft.Spark.Worker/
-   dotnet publish -f net6.0 -r linux-x64
+   dotnet publish -f net8.0 -r linux-x64
    ```
    <details>
    <summary>&#x1F4D9; Click to see sample console output</summary>
 
   ```bash
-   user@machine:/home/user/dotnet.spark/src/csharp/Microsoft.Spark.Worker$ dotnet publish -f net6.0 -r linux-x64
+   user@machine:/home/user/dotnet.spark/src/csharp/Microsoft.Spark.Worker$ dotnet publish -f net8.0 -r linux-x64
    Microsoft (R) Build Engine version 16.0.462+g62fb89029d for .NET Core
    Copyright (C) Microsoft Corporation. All rights reserved.
 
     Restore completed in 36.03 ms for /home/user/dotnet.spark/src/csharp/Microsoft.Spark.Worker/Microsoft.Spark.Worker.csproj.
     Restore completed in 35.94 ms for /home/user/dotnet.spark/src/csharp/Microsoft.Spark/Microsoft.Spark.csproj.
     Microsoft.Spark -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark/Debug/netstandard2.0/Microsoft.Spark.dll
-    Microsoft.Spark.Worker -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.Worker/Debug/net6.0/linux-x64/Microsoft.Spark.Worker.dll
-    Microsoft.Spark.Worker -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.Worker/Debug/net6.0/linux-x64/publish/
+    Microsoft.Spark.Worker -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.Worker/Debug/net8.0/linux-x64/Microsoft.Spark.Worker.dll
+    Microsoft.Spark.Worker -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.Worker/Debug/net8.0/linux-x64/publish/
   ```
 
   </details>
 
 2. Build the Samples
   ```bash
   cd ~/dotnet.spark/examples/Microsoft.Spark.CSharp.Examples/
-  dotnet publish -f net6.0 -r linux-x64
+  dotnet publish -f net8.0 -r linux-x64
   ```
   <details>
   <summary>&#x1F4D9; Click to see sample console output</summary>
 
   ```bash
-  user@machine:/home/user/dotnet.spark/examples/Microsoft.Spark.CSharp.Examples$ dotnet publish -f net6.0 -r linux-x64
+  user@machine:/home/user/dotnet.spark/examples/Microsoft.Spark.CSharp.Examples$ dotnet publish -f net8.0 -r linux-x64
   Microsoft (R) Build Engine version 16.0.462+g62fb89029d for .NET Core
   Copyright (C) Microsoft Corporation. All rights reserved.
 
     Restore completed in 37.11 ms for /home/user/dotnet.spark/src/csharp/Microsoft.Spark/Microsoft.Spark.csproj.
    Restore completed in 281.63 ms for /home/user/dotnet.spark/examples/Microsoft.Spark.CSharp.Examples/Microsoft.Spark.CSharp.Examples.csproj.
    Microsoft.Spark -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark/Debug/netstandard2.0/Microsoft.Spark.dll
-   Microsoft.Spark.CSharp.Examples -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/net6.0/linux-x64/Microsoft.Spark.CSharp.Examples.dll
-   Microsoft.Spark.CSharp.Examples -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/net6.0/linux-x64/publish/
+   Microsoft.Spark.CSharp.Examples -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/net8.0/linux-x64/Microsoft.Spark.CSharp.Examples.dll
+   Microsoft.Spark.CSharp.Examples -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/net8.0/linux-x64/publish/
   ```
 
  </details>
 
 # Run Samples
 
-Once you build the samples, you can use `spark-submit` to submit your .NET 6 apps. Make sure you have followed the [pre-requisites](#pre-requisites) section and installed Apache Spark.
+Once you build the samples, you can use `spark-submit` to submit your .NET 8 apps. Make sure you have followed the [pre-requisites](#pre-requisites) section and installed Apache Spark.
 
-1. Set the `DOTNET_WORKER_DIR` or `PATH` environment variable to include the path where the `Microsoft.Spark.Worker` binary has been generated (e.g., `~/dotnet.spark/artifacts/bin/Microsoft.Spark.Worker/Debug/net6.0/linux-x64/publish`)
-2. Open a terminal and go to the directory where your app binary has been generated (e.g., `~/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/net6.0/linux-x64/publish`)
+1. Set the `DOTNET_WORKER_DIR` or `PATH` environment variable to include the path where the `Microsoft.Spark.Worker` binary has been generated (e.g., `~/dotnet.spark/artifacts/bin/Microsoft.Spark.Worker/Debug/net8.0/linux-x64/publish`)
+2. Open a terminal and go to the directory where your app binary has been generated (e.g., `~/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/net8.0/linux-x64/publish`)
 3. Running your app follows the basic structure:
 ```bash
 spark-submit \
````
