Scaling Backup and Restore for Large (20TB–200TB) Databases in SAP ASE
Now we move into real-world scale, where databases are so large that a simple backup strategy will fail due to:
Long backup windows (many hours)
Disk throughput limits
Restore time exceeding acceptable downtime
Massive transaction log growth
This guide is about how to design high-performance, scalable backup systems that work for 20TB–200TB databases.
1. The Reality of Large Databases
Let’s be honest: backing up a 200TB database is not just “running dump database.”
If you try:
dump database bigdb
to "/backup/bigdb_full.dmp"
You will likely face:
Backup takes 10–20+ hours
System I/O becomes saturated
Application performance degrades
Restore takes even longer
That is unacceptable in a banking system.
2. Key Concept: Backup Parallelism (Striping)
The most important concept for large databases is:
Parallel backups using multiple devices (striping)
Instead of writing to one file, you write to multiple files at the same time.
2.1 Why Striping Works
Each disk has limited throughput
Multiple disks provide higher combined throughput
Backup runs faster
Restore also runs faster
2.2 Example: Striped Backup
dump database bigdb
to "/backup1/bigdb_stripe1.dmp"
stripe on "/backup2/bigdb_stripe2.dmp"
stripe on "/backup3/bigdb_stripe3.dmp"
stripe on "/backup4/bigdb_stripe4.dmp"
with compression = 5
This runs 4 parallel streams.
2.3 How Many Stripes Should You Use?
Start simple:
20TB database → 4–8 stripes
50TB database → 8–16 stripes
100TB+ → 16–32 stripes
But don’t go extreme immediately—test gradually.
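As a rough back-of-envelope check, you can estimate the wall-clock backup time from the database size, the stripe count, and the sustained throughput of each backup device. The figures below are assumptions; substitute your own measurements:

#!/bin/bash
# Rough backup-time estimate; all figures are illustrative assumptions
DB_SIZE_TB=50
STRIPES=8
MB_PER_SEC_PER_STRIPE=200   # assumed sustained write throughput per device
TOTAL_MB=$((DB_SIZE_TB * 1024 * 1024))
SECONDS_NEEDED=$((TOTAL_MB / (STRIPES * MB_PER_SEC_PER_STRIPE)))
echo "Estimated backup time: ~$((SECONDS_NEEDED / 3600)) hours"

At 8 stripes and 200 MB/s per device, a 50TB database needs roughly 9 hours; doubling the stripes (on genuinely independent disks) roughly halves that.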
3. Storage Architecture for Backups
Your backup strategy is only as good as your storage.
3.1 Recommended Layout
Use multiple mount points:
/backup1/
/backup2/
/backup3/
/backup4/
Each should ideally be:
Separate disk / LUN
High throughput
Not shared with database data files
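A quick sanity check that the mount points really sit on separate filesystems (the paths are the example layout above):

df -h /backup1 /backup2 /backup3 /backup4

Each line of output should show a different underlying device; if two paths resolve to the same filesystem, your stripes will compete for the same disk.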
3.2 What NOT to Do
Avoid:
Single disk backups
Backups on same disk as database
Network bottlenecks
4. Designing Backup Strategy for Banking Systems
Now we combine everything.
4.1 Full Backup Strategy (Large DB)
Instead of daily full backups, consider:
Full backup: once every 2 days
Log backups: frequent
4.2 Transaction Log Strategy
For banking systems:
Every 5–15 minutes
Possibly every 1–5 minutes for high-volume systems
4.3 Example Schedule
Full backup: 2:00 AM
Log backups: every 10 minutes
Copy to remote: continuous
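One way to implement this schedule with plain cron; the wrapper script names are hypothetical (Section 9 shows what such a wrapper looks like):

# Full striped backup at 02:00
0 2 * * * /opt/dba/full_backup.sh >> /var/log/ase_full_backup.log 2>&1
# Transaction log backup every 10 minutes
*/10 * * * * /opt/dba/log_backup.sh >> /var/log/ase_log_backup.log 2>&1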
5. Advanced Backup Script
5.1 Full Backup with Striping
use master
go
dump database bigdb
to "/backup1/bigdb_full_$(date +%Y%m%d)_1.dmp"
stripe on "/backup2/bigdb_full_$(date +%Y%m%d)_2.dmp"
stripe on "/backup3/bigdb_full_$(date +%Y%m%d)_3.dmp"
stripe on "/backup4/bigdb_full_$(date +%Y%m%d)_4.dmp"
with compression = 5, init
go
Note: the $(date +%Y%m%d) pieces in the file names are shell substitutions; they only work when the SQL is sent through a shell here-document (as in Section 9). When running the dump interactively in isql, use a fixed file name instead.
5.2 Log Backup with Striping
use master
go
dump transaction bigdb
to "/backup1/bigdb_log_$(date +%Y%m%d_%H%M)_1.dmp"
stripe on "/backup2/bigdb_log_$(date +%Y%m%d_%H%M)_2.dmp"
with compression = 5
go
6. Performance Tuning for Backups
Now we optimize.
6.1 Monitor I/O Performance
Use OS tools:
iostat -x 5
Watch for:
%util near 100 → disk bottleneck
High wait times → slow storage
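A rough one-liner to flag saturated devices, assuming the sysstat version of iostat where %util is the last column:

iostat -dx 5 2 | awk 'NF && $1 != "Device" && $NF+0 > 90 {print $1, "%util =", $NF}'

Any device that keeps appearing here while a dump is running is your bottleneck (note that the first iostat report is a since-boot average, so pay attention to the later samples).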
6.2 Adjust Compression Level
Low compression = faster, larger files
High compression = slower, smaller files
Start with:
with compression = 3
Increase if CPU allows.
6.3 Use Backup Server Configuration
Check:
sp_configure "number of backup connections"
Increase if needed.
7. Restore Strategy for Large Databases
Backing up is only half the job.
Restoring a 100TB database quickly is the real challenge.
7.1 Parallel Restore (Same as Backup)
load database bigdb
from "/backup1/bigdb_full_20260514_1.dmp"
stripe on "/backup2/bigdb_full_20260514_2.dmp"
stripe on "/backup3/bigdb_full_20260514_3.dmp"
stripe on "/backup4/bigdb_full_20260514_4.dmp"
go
7.2 Apply Logs Sequentially
load transaction bigdb
from "/backup1/bigdb_log_20260514_0210_1.dmp"
stripe on "/backup2/bigdb_log_20260514_0210_2.dmp"
go
Repeat for each log backup.
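A minimal loop sketch for applying a day's worth of log dumps in order, assuming the two-stripe naming convention from Section 5.2 (the credentials, server name, and date are placeholders):

#!/bin/bash
# Load each transaction log dump in timestamp order
for STRIPE1 in $(ls /backup1/bigdb_log_20260514_*_1.dmp | sort); do
    STRIPE2="/backup2/$(basename "${STRIPE1%_1.dmp}")_2.dmp"
    isql -Uuser -Ppassword -Sserver <<EOF
load transaction bigdb
from "$STRIPE1"
stripe on "$STRIPE2"
go
EOF
done

After the last log has been applied, bring the database back online with online database bigdb.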
8. Reducing Restore Time (Critical for Banks)
8.1 Keep Backups on Fast Storage
SSD or high-speed SAN
Avoid slow network restores
8.2 Minimize Number of Log Files
Too many small log backups make the restore slow: every log dump must be loaded in sequence, so a 5-minute dump interval means 288 files per day to apply after a full restore, versus 96 at a 15-minute interval.
Balance dump frequency against acceptable restore time.
8.3 Practice Restore Regularly
At least:
Weekly full restore test
Monthly disaster simulation
9. Automating at Scale
9.1 Shell Script Example
#!/bin/bash
# Full striped dump of bigdb: one stripe per backup mount point
DATE=$(date +%Y%m%d)
isql -Uuser -Ppassword -Sserver <<EOF
dump database bigdb
to "/backup1/bigdb_full_${DATE}_1.dmp"
stripe on "/backup2/bigdb_full_${DATE}_2.dmp"
stripe on "/backup3/bigdb_full_${DATE}_3.dmp"
stripe on "/backup4/bigdb_full_${DATE}_4.dmp"
with compression = 5, init
go
EOF
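A matching wrapper for the striped log dumps from Section 5.2 (again a sketch; adjust credentials, stripe count, and paths to your environment):

#!/bin/bash
STAMP=$(date +%Y%m%d_%H%M)
isql -Uuser -Ppassword -Sserver <<EOF
dump transaction bigdb
to "/backup1/bigdb_log_${STAMP}_1.dmp"
stripe on "/backup2/bigdb_log_${STAMP}_2.dmp"
with compression = 5
go
EOF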
9.2 Logging Everything
./backup.sh > backup_$(date +%Y%m%d).log 2>&1
10. Free Tools for Large Environments
You don’t need expensive tools to be effective.
10.1 Useful Free Tools
rsync → copy backups
scp → secure transfer
cron → scheduling
iostat, vmstat → performance monitoring
grep, awk → log parsing
10.2 Example: Remote Backup Copy
rsync -av --progress /backup1/ backupserver:/remote_backup/
11. Monitoring and Alerts
You must detect failures immediately.
11.1 Simple Alert Script
grep -i error backup.log && mail -s "Backup Failed" dba@bank.com < backup.log
11.2 What to Monitor
Backup duration
File size anomalies
Failed jobs
Disk space
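For size anomalies, a crude but effective approach is to compare today's combined stripe size against yesterday's. This is only a sketch for GNU/Linux; the paths, the 20% threshold, and the mail recipient are examples:

#!/bin/bash
TODAY=$(date +%Y%m%d)
YESTERDAY=$(date -d "yesterday" +%Y%m%d)
SIZE_TODAY=$(du -cb /backup*/bigdb_full_${TODAY}_*.dmp 2>/dev/null | tail -1 | cut -f1)
SIZE_YDAY=$(du -cb /backup*/bigdb_full_${YESTERDAY}_*.dmp 2>/dev/null | tail -1 | cut -f1)
# Warn if today's full backup is more than 20% smaller than yesterday's
if [ -n "$SIZE_TODAY" ] && [ -n "$SIZE_YDAY" ] && [ "$SIZE_TODAY" -lt $((SIZE_YDAY * 80 / 100)) ]; then
    echo "Full backup shrank from $SIZE_YDAY to $SIZE_TODAY bytes" | mail -s "Backup size anomaly" dba@bank.com
fi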
12. Handling Failures
12.1 If Backup Fails
Check logs immediately
Retry backup
Ensure log backups continue
12.2 If Disk Fills Up
Delete old backups (carefully)
Move to remote storage
Increase capacity
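To catch a filling disk before it breaks the next dump, a simple threshold check can run from cron (85% is an arbitrary example):

#!/bin/bash
ALERT=$(df -P /backup1 /backup2 /backup3 /backup4 | awk 'NR > 1 && $5+0 > 85 {print $6, "is", $5, "full"}')
[ -n "$ALERT" ] && echo "$ALERT" | mail -s "Backup disk space warning" dba@bank.com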
13. Best Practices
Always use striping for large DBs
Separate backup storage from data
Keep multiple backup copies
Test restore frequently
Monitor everything
Document procedures
14. Expanded Checklist
Daily:
Full backup completed
Log backups every 5–15 min
Backup size looks correct
Files copied to remote
Weekly:
Test restore completed
Backup performance reviewed
Monthly:
Disaster recovery drill
Storage capacity check