Commit a6692f7

APPS-1697 Clarify documentation for restore by timestamp (#372)

* update readme

* regen toc

* regen toc

* Update README.md

Co-authored-by: Eugene R. <yrizhkov@aerospike.com>

* Update README.md

Co-authored-by: Eugene R. <yrizhkov@aerospike.com>

* Update README.md

Co-authored-by: Eugene R. <yrizhkov@aerospike.com>

---------

Co-authored-by: Eugene R. <yrizhkov@aerospike.com>
1 parent 0fd79aa commit a6692f7

1 file changed: +93 -36 lines changed

README.md

Lines changed: 93 additions & 36 deletions
@@ -9,42 +9,43 @@ You can perform full and incremental backups and set different backup policies a
 There are also several monitoring endpoints to check backup information.

 Use the [OpenAPI generation script](./scripts/generate-openapi.sh) to generate an OpenAPI specification for the service.
-A pre-built OpenAPI specification is available in Swagger format [here](https://aerospike.github.io/aerospike-backup-service/).
+A pre-built OpenAPI specification is available in Swagger
+format [here](https://aerospike.github.io/aerospike-backup-service/).

 # Table of contents

 <!-- toc -->

 - [Getting started](#getting-started)
 - [User guide](#user-guide)
-* [Run](#run)
-* [Configuration](#configuration)
-+ [Configuration File Format](#configuration-file-format)
-+ [Configuration with API](#configuration-with-api)
-* [Monitoring](#monitoring)
-* [Example requests and responses](#example-requests-and-responses)
-+ [Backup](#backup)
-+ [Restore](#restore)
+* [Run](#run)
+* [Configuration](#configuration)
++ [Configuration File Format](#configuration-file-format)
++ [Configuration with API](#configuration-with-api)
+* [Monitoring](#monitoring)
+* [Example requests and responses](#example-requests-and-responses)
++ [Backup](#backup)
++ [Restore](#restore)
 - [FAQ](#faq)
-* [What happens when a backup doesn’t finish before another starts (for the same routine)?](#what-happens-when-a-backup-doesnt-finish-before-another-starts-for-the-same-routine)
-* [Can multiple backup routines be performed simultaneously?](#can-multiple-backup-routines-be-performed-simultaneously)
-* [How does the backup service identify what data to back up during incremental backups?](#how-does-the-backup-service-identify-what-data-to-back-up-during-incremental-backups)
-* [Which storage providers are supported?](#which-storage-providers-are-supported)
+* [What happens when a backup doesn’t finish before another starts (for the same routine)?](#what-happens-when-a-backup-doesnt-finish-before-another-starts-for-the-same-routine)
+* [Can multiple backup routines be performed simultaneously?](#can-multiple-backup-routines-be-performed-simultaneously)
+* [How does the backup service identify what data to back up during incremental backups?](#how-does-the-backup-service-identify-what-data-to-back-up-during-incremental-backups)
+* [Which storage providers are supported?](#which-storage-providers-are-supported)
 - [Build from source](#build-from-source)
 + [Prerequisites](#prerequisites)
 + [Build the service](#build-the-service)
 + [Build Docker image](#build-docker-image)
 + [Build Linux packages](#build-linux-packages)
 + [Release](#release)
 - [Migration Guide](#migration-guide)
-* [v3 -> v3.1](#v3---v31)
-* [v2 -> v3](#v2---v3)
+* [v3 -> v3.1](#v3---v31)
+* [v2 -> v3](#v2---v3)

 <!-- tocstop -->

 # Getting started

-Aerospike Backup Service reads configurations from a YAML file that is provided when the service is launched.
+Aerospike Backup Service reads configurations from a YAML file that is provided when the service is launched.
 See [Run](#run) for specific syntax.

 Linux installation packages are available
@@ -113,7 +114,8 @@ docker run -d -p 8080:8080 -v config.yml:/app/config.yml --name backup-service b

 #### Service

-Run as a service. The default path for the configuration file is `/etc/aerospike-backup-service/aerospike-backup-service.yml`.
+Run as a service. The default path for the configuration file is
+`/etc/aerospike-backup-service/aerospike-backup-service.yml`.

 ```bash
 sudo systemctl start aerospike-backup-service
@@ -214,7 +216,8 @@ However, backup processes already in progress will continue using the configurat

 #### Cluster connection

-Cluster configuration entities denote the configuration properties needed to establish connections to Aerospike clusters.
+Cluster configuration entities denote the configuration properties needed to establish connections to Aerospike
+clusters.
 These connections include the cluster IP address, port number, authentication information, and more.
 See [`POST: /config/clusters`](https://aerospike.github.io/aerospike-backup-service/#/Configuration/addCluster) for the
 full specification.
@@ -227,7 +230,7 @@ including secrets in your configuration.
 This entity includes properties of connections to local or cloud storage, where the backup files are stored.
 You can get information about a specific configured storage option, such as checking the cloud storage location for
 a backup.
-You can also add, update, or remove a storage configuration.
+You can also add, update, or remove a storage configuration.
 See the [Storage](https://aerospike.github.io/aerospike-backup-service/#/Configuration/readAllStorage) entities
 under `/config/storage` for detailed information.

@@ -264,6 +267,7 @@ The service exposes a wide variety of system metrics that [Prometheus](https://p
 following application metrics:

 <!-- Metrics -->
+
 | Name | Description |
 |--------------------------------------------------------|---------------------------------------------|
 | `aerospike_backup_service_runs_total` | Successful backup runs counter |
@@ -489,14 +493,56 @@ The response is a job ID.

 #### Restore using routine name and timestamp

-This option restores the most recent full backup for the given timestamp and then applies all subsequent incremental
-backups up to that timestamp. You don't need to specify the exact backup path or storage.
+This option automatically restores data by identifying and applying the
+appropriate backup sequence based on the specified timestamp.
+For each namespace defined in the backup routine, the system locates the most recent full backup
+prior to the given time and applies all incremental backups created after that full backup,
+up to the target timestamp.
+
+There is no need to specify individual backup paths or storage locations — the system handles this internally. The
+restore process requires a full backup as a foundation; incremental backups cannot be used on their own.
+
+By default, backups are applied in chronological order. However, when restoring to an empty namespace, the system may
+reverse the order of application and use the `CREATE_ONLY` policy. This optimization ensures that each record is written
+exactly once—applying only the latest version—thus reducing write load and generation noise. If needed, this
+optimization can be disabled using the `disable-reordering` flag in the `RestoreTimestampRequest`.
+
+Overall, the process is fully automated: users do not need to manually choose or arrange backups for the restore to
+succeed. The restore process runs in parallel for every namespace.
+
+**Example**
+
+```text
+Timeline ─────────────────────────────────────────────────────────────────────────────────────────▶
+
+Backups:
+[Full A]──[Incr A1]──[Incr A2]──[Full B]──[Incr B1]──[Incr B2]──▶ T ◀──[Incr B3]──[Full C]──...
+
+                                                            Restore Point
+```
+
+What gets restored at T:
+
+* Full backup: `Full B`
+* Incremental backups: `Incr B1`, `Incr B2`
+* Excluded: `Incr B3` and anything after T
+
+Restore order (to an empty namespace): `Incr B2`, `Incr B1`, `Full B`.
+
+- Backups are applied in reverse order. This ensures that the most recent version of each record is restored first. Any
+earlier versions of the same record are skipped, by using the `CREATE_ONLY` policy, reducing unnecessary writes.
+
+Restore order (to a non-empty namespace or with `disable-reordering`): `Full B`, `Incr B1`, `Incr B2`.
+
+* Backups are applied in chronological order.
+All versions of each record are restored step by step.
+If a record was modified multiple times, each update is applied, with the final version appearing last.
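
The selection and ordering rules described in this hunk can be sketched in a few lines of Python. This is an illustration only, not the service's implementation: the `Backup` class, its fields, the `plan_restore` function, and the numeric timestamps are all hypothetical, and the sketch assumes that a backup created exactly at the target timestamp is still eligible and that reordering always happens for an empty namespace unless it is disabled (the README only says the system "may" reorder).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    name: str
    created: int  # creation time in arbitrary units (hypothetical field)
    full: bool    # True for a full backup, False for an incremental one


def plan_restore(backups, timestamp, empty_namespace=False, disable_reordering=False):
    """Return the backups to apply for one namespace, in application order."""
    # Only backups created at or before the target timestamp are candidates.
    eligible = sorted(
        (b for b in backups if b.created <= timestamp),
        key=lambda b: b.created,
    )
    fulls = [b for b in eligible if b.full]
    if not fulls:
        # Incremental backups cannot be restored without a full backup.
        raise ValueError("no full backup at or before the requested timestamp")
    base = fulls[-1]  # most recent full backup
    incrementals = [b for b in eligible if not b.full and b.created > base.created]
    plan = [base] + incrementals  # chronological order
    if empty_namespace and not disable_reordering:
        # Newest first; with a create-only write policy, older versions are skipped.
        plan.reverse()
    return plan


TIMELINE = [
    Backup("Full A", 10, True),
    Backup("Incr A1", 20, False),
    Backup("Incr A2", 30, False),
    Backup("Full B", 40, True),
    Backup("Incr B1", 50, False),
    Backup("Incr B2", 60, False),
    Backup("Incr B3", 70, False),
    Backup("Full C", 80, True),
]
T = 65  # restore point between Incr B2 and Incr B3

print([b.name for b in plan_restore(TIMELINE, T)])
print([b.name for b in plan_restore(TIMELINE, T, empty_namespace=True)])
```

With the timeline from the example, the first call lists `Full B`, `Incr B1`, `Incr B2` in chronological order, and the second call lists the same three backups newest first.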

 [`
 POST {{baseUrl}}/v1/restore/timestamp`](https://aerospike.github.io/aerospike-backup-service/#/Restore/restoreTimestamp)

 <details>
-<summary>Request</summary>
+<summary>Request body</summary>

 <!-- RestoreTimestampRequest -->

@@ -521,7 +567,7 @@ endpoint
 `GET {{baseUrl}}/v1/restore/status/<jobId>`](https://aerospike.github.io/aerospike-backup-service/#/Restore/restoreStatus).

 <details>
-<summary>Request</summary>
+<summary>Request body</summary>


 <!-- CurrentBackupResponse -->
@@ -579,21 +625,32 @@ To manage resource utilization, you can configure the `cluster.max-parallel-scan
 threads operating on a single cluster.

 ## How does the backup service identify what data to back up during incremental backups?
-The Aerospike Backup Service uses Aerospike’s scan operation to identify and backup records,
-with different behaviors for full and incremental backups:
-* **Full Backups:**
-* Capture all records in the specified namespaces/sets without any time filter.
-The service uses a scan operation with no lower time boundary (modAfter = 0).
-
-* **Incremental Backups:**:
-* Only capture records that have been modified since the last successful backup (full or incremental). The service tracks the timestamp of the last backup in a metadata YAML file stored alongside the backup data. This timestamp becomes the lower time boundary (modAfter parameter) for the next incremental backup.
-For the upper time boundary (modBefore), two approaches are available:

-- **Default Behavior (Open-ended)**: No upper time boundary is set. This means records modified during the backup process itself might be included in the backup, but with unpredictable results. For example, if a backup starts at 12:00 and runs for 5 minutes, a record created at 12:01 might be included with either its new or old version—there’s no guarantee which state will be captured.
-- **Sealed Backups**: When the sealed property in the backup policy is set to true, the backup service will only include records modified before the backup start time. While this creates a more precise point-in-time snapshot, there’s still unpredictability: if a record is updated during the backup process, it might be captured in its old state or excluded entirely from the backup.
-
-Users should select the appropriate approach based on their recovery point objectives and consistency requirements. The default open-ended approach ensures better data coverage but with some state unpredictability, while sealed backups provide better point-in-time consistency but might miss records updated during the backup process.
+The Aerospike Backup Service uses Aerospike’s scan operation to identify and back up records,
+with different behaviors for full and incremental backups:

+* **Full Backups:**
+* Capture all records in the specified namespaces/sets without any time filter.
+The service uses a scan operation with no lower time boundary (modAfter = 0).
+
+* **Incremental Backups:**
+* Only capture records that have been modified since the last successful backup (full or incremental). The service
+tracks the timestamp of the last backup in a metadata YAML file stored alongside the backup data. This timestamp
+becomes the lower time boundary (modAfter parameter) for the next incremental backup.
+For the upper time boundary (modBefore), two approaches are available:
+
+- **Default Behavior (Open-ended)**: No upper time boundary is set. This means records modified during the
+backup process itself might be included in the backup, but with unpredictable results. For example, if a
+backup starts at 12:00 and runs for 5 minutes, a record created at 12:01 might be included with either its new
+or old version—there’s no guarantee which state will be captured.
+- **Sealed Backups**: When the sealed property in the backup policy is set to true, the backup service will only
+include records modified before the backup start time. While this creates a more precise point-in-time
+snapshot, there’s still unpredictability: if a record is updated during the backup process, it might be
+captured in its old state or excluded entirely from the backup.
+
+Users should select the appropriate approach based on their recovery point objectives and consistency requirements. The
+default open-ended approach ensures better data coverage but with some state unpredictability, while sealed backups
+provide better point-in-time consistency but might miss records updated during the backup process.
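
The time boundaries described in this FAQ answer can be illustrated with a small Python sketch. It is a simplified model, not the service's code: the function names `scan_window` and `in_window` are hypothetical, times are plain integers (minutes since midnight), and the choice of an inclusive lower bound and an exclusive upper bound is an assumption the README does not confirm. The sketch also ignores the scan-time race the FAQ mentions, where a record changes while the backup is running.

```python
def scan_window(last_backup_time, backup_start, incremental, sealed):
    """Return (mod_after, mod_before) for a backup run; None means no upper bound."""
    # Full backups have no lower boundary (modAfter = 0); incremental backups
    # start from the timestamp of the last successful backup.
    mod_after = last_backup_time if incremental else 0
    # Sealed backups stop at the backup start time; open-ended backups have no upper bound.
    mod_before = backup_start if sealed else None
    return mod_after, mod_before


def in_window(last_update, mod_after, mod_before):
    """True if a record's last-update time falls inside the window (assumed bounds)."""
    if last_update < mod_after:
        return False
    if mod_before is not None and last_update >= mod_before:
        return False
    return True


LAST_BACKUP = 11 * 60        # previous backup finished at 11:00
BACKUP_START = 12 * 60       # this backup starts at 12:00
UPDATED_DURING = 12 * 60 + 1  # record modified at 12:01, while the backup runs

open_ended = scan_window(LAST_BACKUP, BACKUP_START, incremental=True, sealed=False)
sealed = scan_window(LAST_BACKUP, BACKUP_START, incremental=True, sealed=True)

print(in_window(UPDATED_DURING, *open_ended))
print(in_window(UPDATED_DURING, *sealed))
```

Under these assumptions, the record modified at 12:01 falls inside the open-ended window and outside the sealed one, which matches the trade-off between coverage and point-in-time consistency described above.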

 ## Which storage providers are supported?
