Thursday, May 7, 2026

Runbook for Production Migration from OnPrem SQL SERVER to SQL SERVER AWS EC2 with Minute-by-Minute Timing

 Runbook for Production Migration from OnPrem SQL SERVER to SQL SERVER AWS EC2 with Minute-by-Minute Timing


Scenario Assumptions


* Source: On-prem SQL Server


* Target: SQL Server on Amazon EC2


* Method: Backup + Restore + Log Shipping


* Downtime target: 15–30 minutes


* Database size: 10TB–100TB


* Migration window: Weekend (example: Saturday night)




BIG PICTURE FLOW


* Days before → prepare and sync


* Final hour → freeze + final logs


* Minutes → cutover


* After → validate + stabilize



T-7 DAYS (ONE WEEK BEFORE)


Goal: Eliminate surprises


Tasks:


* Full inventory completed


* Test full backup + restore on EC2


* Measure restore time


* Test log shipping loop


* Validate application connectivity


Must Confirm:


* You can restore DB successfully on EC2

* Log shipping works continuously

* Performance is acceptable


T-1 DAY (FINAL PREPARATION)



Goal: Be 100% ready



Tasks:


* Notify stakeholders


* Confirm downtime window


* Freeze schema changes


* Verify backups


Run Final Pre-Check Script


```sql 


SELECT name, state_desc FROM sys.databases;

DBCC CHECKDB('YourDB') WITH NO_INFOMSGS;

```



T-6 HOURS (START INITIAL LOAD)



Goal: Get most data to EC2 before downtime



Step 1: Take Full Backup


```sql 


BACKUP DATABASE YourDB

TO DISK = 'G:\Backup\YourDB_1.bak',

   DISK = 'G:\Backup\YourDB_2.bak',

   DISK = 'G:\Backup\YourDB_3.bak'

WITH COMPRESSION, STATS = 5;

```



Step 2: Transfer Backup


```powershell 


robocopy G:\Backup \\EC2-SQL\Backup *.bak /MT:16

```



Step 3: Restore on EC2 (NORECOVERY)


```sql 


RESTORE DATABASE YourDB

FROM DISK = 'G:\Backup\YourDB_1.bak',

     DISK = 'G:\Backup\YourDB_2.bak',

     DISK = 'G:\Backup\YourDB_3.bak'

WITH NORECOVERY;

```



T-4 HOURS


Goal: Start continuous sync



Step: Start Log Shipping Loop


Every 5 minutes:


```sql 


BACKUP LOG YourDB TO DISK = 'G:\Backup\log.trn';

```


Copy + restore automatically.



Monitor:


* Logs applying correctly


* No gaps


* No errors


T-60 MINUTES (CRITICAL PHASE STARTS)



Goal: Prepare for cutover



Tasks:


* Notify users: “System going read-only soon”


* Ensure log shipping is current


* Validate EC2 database



Validation Script


```sql 


SELECT COUNT(*) FROM CriticalTable;

```


Compare with source.



T-30 MINUTES



Goal: Reduce final sync gap



Change Log Backup Frequency


* From every 5 min → every 1 min



Confirm:


* No backlog of logs


* EC2 nearly in sync



T-15 MINUTES (POINT OF NO RETURN NEAR)



Goal: Prepare final cutover



Tasks:


* Stop batch jobs


* Pause ETL processes


* Notify: “Final downtime starting”



T-10 MINUTES (WRITE FREEZE)



Goal: Stop all writes



Step 1: Disable Application Writes


Options:


* Stop application services


* Or set DB to read-only


---


```sql 


ALTER DATABASE YourDB SET READ_ONLY;

```



Step 2: Verify No Active Transactions


```sql


SELECT * FROM sys.dm_tran_active_transactions;

```


T-5 MINUTES (FINAL SYNC)



Goal: Capture last changes



Step 1: Final Log Backup


```sql 


BACKUP LOG YourDB 

TO DISK = 'G:\Backup\final.trn'

WITH NORECOVERY;

```



Step 2: Copy Final Log


```powershell 


robocopy G:\Backup \\EC2-SQL\Backup final.trn

```



Step 3: Apply Final Log


```sql 


RESTORE LOG YourDB

FROM DISK = 'G:\Backup\final.trn'

WITH RECOVERY;

```



T-0 (CUTOVER MOMENT)


THIS IS THE SWITCH



Step 1: Bring EC2 Database Online


Already done with `WITH RECOVERY`




Step 2: Update Application Connection


Change:


* Connection string


* DNS alias (preferred)



Step 3: Enable Application


* Start services


* Allow connections



T+5 MINUTES


Goal: Smoke Test



Test:


* Login works


* Key queries run


* Application loads



SQL Test


```sql 


SELECT TOP 10 * FROM CriticalTable;

```



T+15 MINUTES


Goal: Validate system health



Run:


```sql 


DBCC CHECKDB('YourDB') WITH NO_INFOMSGS;

```


Check:


* CPU usage


* Disk latency


* Errors



T+30 MINUTES



Goal: Confirm success




Tasks


* Business validation (very important)


* Confirm no missing data


* Monitor performance



T+1 HOUR



Goal: Stabilization



Tasks


* Enable SQL Agent jobs


* Resume ETL


* Monitor logs



T+24 HOURS



Goal: Final confirmation



Tasks:


* Full backup on EC2


* Review performance


* Confirm no issues



ROLLBACK PLAN (MUST EXIST)


If something fails at T+5 or T+15:


Step 1: Redirect Application Back


* Point to on-prem SQL Server



Step 2: Re-enable Writes


```sql 


ALTER DATABASE YourDB SET READ_WRITE;

```



Step 3: Investigate


---



REAL-WORLD TIMING SUMMARY



Bulk Data Movement:


* Happens hours before (T-6h)



Sync Phase:


* Continuous (T-4h → T-10m)



Downtime Window:


Only:T-10 minutes → T+5 minutes



PRO TIPS 


Tip 1: Always Use DNS


Instead of changing app config: Point DNS to EC2



Tip 2: Practice Twice


* First run → learning


* Second run → timing



Tip 3: Record Everything


* Start time


* End time


* Issues



Tip 4: Have War Room


* DBA


* App team


* Network team


---



REALITY Check:


A successful migration is not about just moving data rather it is Timing,  Coordination, Recovery readiness


No comments:

Post a Comment

Runbook for Production Migration from OnPrem SQL SERVER to SQL SERVER AWS EC2 with Minute-by-Minute Timing

  Runbook for  Production Migration from OnPrem SQL SERVER to SQL SERVER AWS EC2 with Minute-by-Minute Timing Scenario Assumptions * Source:...