
Commit 20df78c

Improve READMEs consistency (#59)
* Updated READMEs for consistency
* Improve "Running in IntelliJ" section for Java examples
* Better clarify connecting to VPC-networked services
1 parent 34c97c5 commit 20df78c

15 files changed (+240, -158 lines)


java/CustomMetrics/README.md

Lines changed: 20 additions & 6 deletions
```diff
@@ -3,18 +3,32 @@
 * Flink version: 1.20
 * Flink API: DataStream API
 * Language: Java (11)
+* Flink connectors: Kinesis Sink
 
 This example demonstrates how to create your own metrics to track application-specific data, such as processing events or accessing external resources, and publish it to CloudWatch.
 
+The application generates data internally and writes it to a Kinesis Stream.
+
 ### Runtime configuration
 
-The application reads the runtime configuration from the Runtime Properties, when running on Amazon Managed Service for Apache Flink.
+When running on Amazon Managed Service for Apache Flink, the runtime configuration is read from *Runtime Properties*.
+
+When running locally, the configuration is read from the [`resources/flink-application-properties-dev.json`](src/main/resources/flink-application-properties-dev.json) file located in the resources folder.
+
+Runtime parameters:
+
+| Group ID        | Key           | Description               |
+|-----------------|---------------|---------------------------|
+| `OutputStream0` | `stream.name` | Name of the output stream |
+| `OutputStream0` | `aws.region`  | (optional) Region of the output stream. If not specified, it uses the application region or, when running locally, the default region of the AWS profile. |
+
+All parameters are case-sensitive.
+
+This simple example assumes the Kinesis Stream is in the same region as the application or, when running locally, in the default region of the authentication profile.
 
-Runtime Properties are expected in the Group ID `OutputStream0`. They are all case-sensitive:
-* `stream.name` - Kinesis Data Stream to be used for sink.
-* `aws.region` - AWS Region containing test resources.
 
 ### Running locally in IntelliJ
-Update `PropertyMap` in [configuration file](src/main/resources/flink-application-properties-dev.json).
 
-To start the Flink job in IntelliJ edit the Run/Debug configuration enabling *'Add dependencies with "provided" scope to the classpath'*.
+You can run this example directly in IntelliJ, without any local Flink cluster or local Flink installation.
+
+See [Running examples locally](../running-examples-locally.md) for details.
```
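At the heart of this README is metric registration. As a rough sketch of the pattern, hedged since the class and metric names here are illustrative and not necessarily those used in this project, a custom counter is registered in an operator's `open()` method and incremented per record:

```java
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;

// Hypothetical sketch: a custom counter registered on the operator's metric
// group. Managed Service for Apache Flink publishes such metrics to CloudWatch.
public class EventCountingMapper extends RichMapFunction<String, String> {

    private transient Counter eventsProcessed;

    @Override
    public void open(Configuration parameters) {
        // Register the custom metric when the operator starts
        eventsProcessed = getRuntimeContext()
                .getMetricGroup()
                .counter("eventsProcessed"); // metric name is illustrative
    }

    @Override
    public String map(String value) {
        eventsProcessed.inc(); // track application-specific activity
        return value;
    }
}
```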

java/GettingStarted/README.md

Lines changed: 15 additions & 13 deletions
```diff
@@ -5,32 +5,34 @@ Skeleton project for a basic Flink Java application to run on Amazon Managed Ser
 * Flink version: 1.20
 * Flink API: DataStream API
 * Language: Java (11)
+* Flink connectors: Kinesis Consumer, Kinesis Sink
 
 The project can run both on Amazon Managed Service for Apache Flink, and locally for development.
 
-The application shows how to get runtime configuration, and sets up a Kinesis Data Stream source and a sink.
+The application shows how to get runtime configuration, and set up a Kinesis Data Stream source and a sink.
 
 ### Runtime configuration
 
-The application reads the runtime configuration from the Runtime Properties, when running on Amazon Managed Service for
-Apache Flink, or from command line parameters, when running locally.
+When running on Amazon Managed Service for Apache Flink, the runtime configuration is read from *Runtime Properties*.
 
-Runtime Properties are expected in the Group ID `FlinkApplicationProperties`.
-Command line parameters should be prepended by `--`.
+When running locally, the configuration is read from the [`resources/flink-application-properties-dev.json`](resources/flink-application-properties-dev.json) file located in the resources folder.
 
-They are all case-sensitive.
+Runtime parameters:
 
-Configuration parameters:
+| Group ID        | Key           | Description                |
+|-----------------|---------------|----------------------------|
+| `InputStream0`  | `stream.name` | Name of the input stream   |
+| `InputStream0`  | `aws.region`  | (optional) Region of the input stream. If not specified, it uses the application region or, when running locally, the default region of the AWS profile. |
+| `OutputStream0` | `stream.name` | Name of the output stream  |
+| `OutputStream0` | `aws.region`  | (optional) Region of the output stream. If not specified, it uses the application region or, when running locally, the default region of the AWS profile. |
 
-* `InputStreamRegion` region of the input stream (default: `us-east-1`)
-* `InputStreamName` name of the input Kinesis Data Stream (default: `ExampleInputStream`)
-* `OutputStreamRegion` region of the input stream (default: `us-east-1`)
-* `OutputStreamName` name of the input Kinesis Data Stream (default: `ExampleOutputStream`)
+All parameters are case-sensitive.
 
 ### Running in IntelliJ
 
-To start the Flink job in IntelliJ edit the Run/Debug configuration enabling *'Add dependencies with "provided" scope to
-the classpath'*.
+You can run this example directly in IntelliJ, without any local Flink cluster or local Flink installation.
+
+See [Running examples locally](../running-examples-locally.md) for details.
 
 ### Generating data
```
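The dual-source configuration described above, Runtime Properties when deployed and a JSON file when local, typically reduces to a small helper along these lines. This is a hedged sketch, assuming the `KinesisAnalyticsRuntime` utility from the `aws-kinesisanalytics-runtime` library; the helper and class names are hypothetical and may differ from the actual project code:

```java
import com.amazonaws.services.kinesisanalytics.runtime.KinesisAnalyticsRuntime;

import java.io.IOException;
import java.util.Map;
import java.util.Objects;
import java.util.Properties;

public final class ConfigSketch {

    // Bundled under src/main/resources in this kind of project
    private static final String LOCAL_PROPERTIES_FILE = "flink-application-properties-dev.json";

    // When deployed, Runtime Properties come from the service; when running
    // locally, the same group structure is loaded from the JSON file.
    static Map<String, Properties> loadProperties(boolean isLocal) throws IOException {
        if (isLocal) {
            String path = Objects.requireNonNull(
                    ConfigSketch.class.getClassLoader().getResource(LOCAL_PROPERTIES_FILE)).getPath();
            return KinesisAnalyticsRuntime.getApplicationProperties(path);
        }
        return KinesisAnalyticsRuntime.getApplicationProperties();
    }

    public static void main(String[] args) throws IOException {
        Properties input = loadProperties(true).get("InputStream0");
        System.out.println("Input stream: " + input.getProperty("stream.name"));
    }
}
```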

java/GettingStartedTable/README.md

Lines changed: 12 additions & 10 deletions
```diff
@@ -5,6 +5,7 @@ Example of project for a basic Flink Java application using the Table API & SQL
 * Flink version: 1.20
 * Flink API: Table API & SQL, and DataStream API
 * Language: Java (11)
+* Flink connectors: Kinesis Sink
 
 The project can run both on Amazon Managed Service for Apache Flink, and locally for development.
 
@@ -16,20 +17,21 @@ control of the generated data. In this example we implemented a `GeneratorFuncti
 
 ### Runtime configuration
 
-The application reads the runtime configuration from the Runtime Properties, when running on Amazon Managed Service for
-Apache Flink, or from command line parameters, when running locally.
+When running on Amazon Managed Service for Apache Flink, the runtime configuration is read from *Runtime Properties*.
 
-Runtime Properties are expected in the Group ID `FlinkApplicationProperties`.
+When running locally, the configuration is read from the [`resources/flink-application-properties-dev.json`](resources/flink-application-properties-dev.json) file located in the resources folder.
 
-Command line parameters should be prepended by `--`.
+Runtime parameters:
 
-Configuration parameters:
+| Group ID | Key    | Description                                                                     |
+|----------|--------|---------------------------------------------------------------------------------|
+| `bucket` | `name` | Name of the destination bucket, **without** the prefix "s3://"                  |
+| `bucket` | `path` | Path within the bucket the output will be written to, without the trailing "/"  |
 
-* `s3Path` <s3-bucket>/<path> of the S3 destination , **without** the prefix `s3://`
-
-They parameters all case-sensitive.
+All parameters are case-sensitive.
 
 ### Running in IntelliJ
 
-To start the Flink job in IntelliJ edit the Run/Debug configuration enabling *'Add dependencies with "provided" scope to
-the classpath'*.
+You can run this example directly in IntelliJ, without any local Flink cluster or local Flink installation.
+
+See [Running examples locally](../running-examples-locally.md) for details.
```
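For reference, a local dev properties file matching the `bucket` group above could look like the following sketch, using the `PropertyGroupId`/`PropertyMap` structure this commit shows elsewhere; the bucket name and path are placeholder values:

```json
[
  {
    "PropertyGroupId": "bucket",
    "PropertyMap": {
      "name": "my-example-bucket",
      "path": "output"
    }
  }
]
```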

java/KafkaConfigProviders/Kafka-SASL_SSL-ConfigProviders/README.md

Lines changed: 16 additions & 1 deletion
```diff
@@ -160,7 +160,11 @@ is controlled via networking (SecurityGroups, NACL) and by the SASL credentials
 
 ### Application configuration parameters
 
-The application expects the following Runtime properties:
+When running on Amazon Managed Service for Apache Flink, the runtime configuration is read from *Runtime Properties*.
+
+When running locally, the configuration is read from the [`resources/flink-application-properties-dev.json`](resources/flink-application-properties-dev.json) file located in the resources folder.
+
+Runtime parameters:
 
 | Group ID  | Key                                 | Description                                                                                             |
 |-----------|-------------------------------------|---------------------------------------------------------------------------------------------------------|
@@ -172,3 +176,14 @@ The application expects the following Runtime properties:
 | `Output0` | `credentials.secret`                | Name of the secret (not the ARN) in SecretsManager containing the SASL/SCRAM credentials               |
 | `Output0` | `credentials.secret.username.field` | Name of the field (the key) of the secret, containing the SASL username. Optional, default: `username` |
 | `Output0` | `credentials.secret.password.field` | Name of the field (the key) of the secret, containing the SASL password. Optional, default: `password` |
+
+All parameters are case-sensitive.
+
+## Running locally in IntelliJ
+
+> Due to MSK VPC networking, to run this example on your machine you need to set up network connectivity to the VPC where MSK is deployed, for example with a VPN.
+> Setting up this connectivity depends on your environment and is out of scope for this example.
+
+You can run this example directly in IntelliJ, without any local Flink cluster or local Flink installation.
+
+See [Running examples locally](../running-examples-locally.md) for details.
```
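A sketch of the `Output0` credentials block in a local dev properties file, matching the parameters documented above; the secret name is a placeholder, and the connection keys from the table rows truncated in this diff are omitted:

```json
[
  {
    "PropertyGroupId": "Output0",
    "PropertyMap": {
      "credentials.secret": "MySaslScramSecret",
      "credentials.secret.username.field": "username",
      "credentials.secret.password.field": "password"
    }
  }
]
```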

java/KafkaConfigProviders/Kafka-mTLS-Keystore-ConfigProviders/README.md

Lines changed: 11 additions & 7 deletions
```diff
@@ -3,7 +3,7 @@
 * Flink version: 1.19
 * Flink API: DataStream API
 * Language: Java (11)
-* Connectors: Kafka (mTLS authentication)
+* Flink connectors: Kafka (mTLS authentication)
 
 This sample illustrates how to configure the Flink Kafka connectors (KafkaSource and KafkaSink)
 retrieving custom KeyStore and TrustStore at runtime, using Config Providers.
@@ -71,10 +71,11 @@ Access Policy/Role associated with the application that is running a config prov
 
 ### Runtime configuration
 
-The application reads the runtime configuration from the Runtime Properties, when running on Amazon Managed Service for Apache Flink,
-or from `flink-application-properties-dev.json`, when running locally.
+When running on Amazon Managed Service for Apache Flink, the runtime configuration is read from *Runtime Properties*.
 
-Runtime Properties are expected in the Group ID `Input0` and they are all case-sensitive.
+When running locally, the configuration is read from the [`resources/flink-application-properties-dev.json`](resources/flink-application-properties-dev.json) file located in the resources folder.
+
+Runtime parameters:
 
 | GroupId | Key                     | Default | Description                                                        |
 |---------|-------------------------|---------|--------------------------------------------------------------------|
@@ -89,10 +90,13 @@ Runtime Properties are expected in the Group ID `Input0` and they are all case-s
 | `Input0` | `keystore.secret`       |         | SecretManager secret ID containing the password of the keystore   |
 | `Input0` | `keystore.secret.field` |         | SecretManager secret key containing the password of the keystore  |
 
+All parameters are case-sensitive.
 
 ## Running locally in IntelliJ
 
-To run the application in IntelliJ
+> Due to MSK VPC networking, to run this example on your machine you need to set up network connectivity to the VPC where MSK is deployed, for example with a VPN.
+> Setting up this connectivity depends on your environment and is out of scope for this example.
+
+You can run this example directly in IntelliJ, without any local Flink cluster or local Flink installation.
 
-1. Edit the Run/Debug configuration enabling *'Add dependencies with "provided" scope to the classpath'*
-2. Update `flink-application-properties-dev.json` with property values (`bootstrap.servers`, `topic` etc.) that fit your environment.
+See [Running examples locally](../running-examples-locally.md) for details.
```
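As a sketch of how the `Input0` parameters map onto the local dev properties file: all values are placeholders, `bootstrap.servers` and `topic` are among the properties the README says to set for your environment, and only the keystore keys visible in the table above are shown (its remaining rows are truncated in this diff):

```json
[
  {
    "PropertyGroupId": "Input0",
    "PropertyMap": {
      "bootstrap.servers": "<mTLS bootstrap servers>",
      "topic": "<source topic>",
      "keystore.secret": "MyKeystorePasswordSecret",
      "keystore.secret.field": "password"
    }
  }
]
```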

java/KafkaConnectors/README.md

Lines changed: 28 additions & 22 deletions
```diff
@@ -9,35 +9,40 @@ This example demonstrates how to use
 [Flink Kafka Connector](https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kafka/),
 source and sink.
 
-This example uses `KafkaSource` and `KafkaSink`.
+This example uses KafkaSource and KafkaSink.
 
 ![Flink Example](images/flink-example.png)
 
+> In this example, the Kafka Sink uses *exactly-once* delivery guarantees. This leverages Kafka transactions under the hood, improving guarantees but
+> adding some overhead and increasing the effective latency of the output to the consumers of the destination Kafka topic.
+>
+> Moreover, there are failure scenarios where the Kafka Sink may still cause duplicates, even when set for exactly-once guarantees.
+> See [FLIP-319](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255071710) for more details.
+>
+> We recommend not treating Kafka Sink *exactly-once* guarantees as the default setting for all sinks to Kafka.
+> Make sure you understand the implications before enabling it. Refer to the [Flink Kafka sink documentation](https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/connectors/datastream/kafka/#fault-tolerance) for details.
+
 Note that the old
 [`FlinkKafkaConsumer`](https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kafka/#kafka-sourcefunction)
 and [`FlinkKafkaProducer`](https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kafka/#kafka-producer)
 were removed in Flink 1.17 and 1.15, respectively.
 
 ## Runtime configuration
 
-The application reads the runtime configuration from the Runtime Properties, when running on Amazon Managed Service for Apache Flink,
-or from command line parameters, when running locally.
-
-Runtime Properties are expected in the Group IDs `Input0` and `Output0` (see [`resources/flink-application-properties-dev.json`](resources/flink-application-properties-dev.json)).
-
-All properties are case-sensitive.
+When running on Amazon Managed Service for Apache Flink, the runtime configuration is read from *Runtime Properties*.
 
-Configuration parameters:
+When running locally, the configuration is read from the [`resources/flink-application-properties-dev.json`](resources/flink-application-properties-dev.json) file located in the resources folder.
 
-For the source (i.e. Group ID `Input0`):
-* `bootstrap.servers` source cluster boostrap servers
-* `topic` source topic (default: `source`)
-* `group.id` source group id (default: `my-group`)
+Runtime parameters:
 
-For the sink (i.e. Group ID `Output0`):
-* `bootstrap.servers` sink cluster bootstrap servers
-* `topic` sink topic (default: `destination`)
-* `transaction.timeout.ms` sink transaction timeout (default: `1000`)
+| Group ID  | Key                      | Description                                 |
+|-----------|--------------------------|---------------------------------------------|
+| `Input0`  | `bootstrap.servers`      | Source cluster bootstrap servers.           |
+| `Input0`  | `topic`                  | Source topic (default: `source`).           |
+| `Input0`  | `group.id`               | Source group id (default: `my-group`).      |
+| `Output0` | `bootstrap.servers`      | Destination cluster bootstrap servers.      |
+| `Output0` | `topic`                  | Destination topic (default: `destination`). |
+| `Output0` | `transaction.timeout.ms` | Sink transaction timeout (default: `1000`). |
 
 If you are connecting with no-auth and no SSL, the above will work. Otherwise, you need additional configuration for both source and sink.
 
@@ -54,9 +59,10 @@ When using IAM Auth, the following Runtime Properties are expected at the Group
 
 ## Running locally in IntelliJ
 
-To run this example locally -
-* Run a Kafka cluster locally. You can refer https://kafka.apache.org/quickstart to download and start Kafka locally.
-* Create `source` and `sink` topics.
-* To start the Flink job in IntelliJ edit the Run/Debug configuration enabling *'Add dependencies with "provided" scope to the classpath'*.
-* Update [`resources/flink-application-properties-dev.json`](resources/flink-application-properties-dev.json)
-* Execute using credentials with permissions to consume data from a Kinesis Data Stream and write data into Amazon S3.
+> Due to MSK VPC networking, to run this example on your machine you need to set up network connectivity to the VPC where MSK is deployed, for example with a VPN.
+> Alternatively, you can use a local Kafka installation, for example in a container.
+> Setting up the connectivity or a local containerized Kafka depends on your setup and is out of scope for this example.
+
+You can run this example directly in IntelliJ, without any local Flink cluster or local Flink installation.
+
+See [Running examples locally](../running-examples-locally.md) for details.
```
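To make the *exactly-once* note above concrete, this is roughly what an exactly-once `KafkaSink` looks like with the current Flink Kafka connector builder API. A hedged sketch, not necessarily the exact builder chain used in this example; servers, topic, prefix, and timeout are placeholders:

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;

public class ExactlyOnceSinkSketch {

    public static KafkaSink<String> buildSink() {
        return KafkaSink.<String>builder()
                .setBootstrapServers("broker-1:9092") // placeholder
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("destination") // default sink topic from the table above
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                // Exactly-once enables Kafka transactions: stronger guarantees,
                // but consumers only see records once the checkpoint commits.
                .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                .setTransactionalIdPrefix("flink-example-") // required for exactly-once
                // Should comfortably exceed the checkpoint interval
                .setProperty("transaction.timeout.ms", "300000")
                .build();
    }
}
```

The transactional id prefix is mandatory for exactly-once, and the transaction timeout should exceed the checkpoint interval, which is why the README surfaces `transaction.timeout.ms` as a runtime parameter.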

java/KafkaConnectors/src/main/resources/flink-application-properties-dev.json

Lines changed: 1 addition & 1 deletion
```diff
@@ -4,7 +4,7 @@
     "PropertyMap": {
       "bootstrap.servers": "<BootstrapServers>",
       "topic": "<Topic>",
-      "group.id": "<S3Path>"
+      "group.id": "<ConsumerGroupID>"
     }
   },
   {
```
