Execution Phase: Migrating from On-Prem Oracle to MongoDB on Azure VM in the Banking Industry
During Migration: Tailored for the US Banking Sector
In this part, we move from planning to actual execution.
Because this is tailored for banking systems, we must follow stricter rules:
Zero or near-zero data loss
High data accuracy (financial integrity)
Auditability (every step traceable)
Security (PII + financial data protection)
Minimal downtime
1. Banking Migration Strategy: Reality Check
Let’s be very clear:
You should NOT migrate a 10TB–100TB banking database in one shot.
Instead, use:
Phased Migration Approach (Best Practice)
1. Historical Data Migration (old data)
2. Incremental Migration (recent changes)
3. Final Cutover (switch systems)
2. Data Classification: Critical Phase
Before migrating, classify data:
2.1 Types of Banking Data
Customer data (PII)
Transactions (very sensitive)
Accounts (balances must be exact)
Audit logs (must never be lost)
2.2 Migration Priority
1. Audit logs (immutable)
2. Historical transactions
3. Customer profiles
4. Real-time transactions (last step)
3. Secure Data Extraction from Oracle
3.1 Use Oracle Data Pump for Large Data
For banking-scale data, Data Pump is required.
expdp system/password DIRECTORY=DATA_PUMP_DIR \
DUMPFILE=banking_%U.dmp \
LOGFILE=export.log \
PARALLEL=8 \
FILESIZE=50G
Why this matters:
Parallel export (faster)
Split files (safe recovery)
3.2 For Near Real-Time Systems (Important)
You must capture ongoing changes.
Use:
Oracle LogMiner (free)
Or incremental queries
Example incremental query:
SELECT * FROM transactions
WHERE last_updated > TO_DATE('2026-01-01','YYYY-MM-DD');
4. Data Transformation (Relational → MongoDB)
This is the hardest part in banking systems.
4.1 Example: Banking Transaction Table
Oracle:
TRANSACTIONS
- txn_id
- account_id
- amount
- currency
- txn_date
MongoDB Design (Recommended)
{
  "account_id": 123,
  "transactions": [
    {
      "txn_id": 1,
      "amount": 500,
      "currency": "USD",
      "txn_date": "2026-01-01"
    }
  ]
}
But for very large banking systems, DO NOT embed everything.
Use a hybrid model (see the sketch after this list):
Accounts collection
Transactions collection
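For illustration, here is a minimal pymongo sketch of that hybrid layout, reusing the field names from the example above (the connection string and values are placeholders):

from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017").banking  # placeholder URI

# Accounts collection: one small document per account, no embedded history
db.accounts.insert_one({"account_id": 123, "currency": "USD", "balance": 500})

# Transactions collection: one document per transaction, referencing its account
db.transactions.insert_one({
    "txn_id": 1,
    "account_id": 123,  # reference back to the accounts collection
    "amount": 500,
    "currency": "USD",
    "txn_date": "2026-01-01",
})

# Index the reference so per-account history queries stay fast
db.transactions.create_index("account_id")

This keeps account documents small and lets transaction history grow without hitting MongoDB's 16MB document size limit.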
5. Data Masking (Mandatory for Banking Security)
Before moving data:
Mask sensitive fields
5.1 Example Masking Query
SELECT
account_id,
'XXXX-XXXX-XXXX-' || SUBSTR(card_number, -4) AS card_masked,
balance
FROM accounts;
This protects:
Credit card data
Personally identifiable information (PII)
6. Convert Data to JSON (Automation Required)
6.1 Python Script for Banking Data
import cx_Oracle
import json

connection = cx_Oracle.connect("user/password@host:port/db")
cursor = connection.cursor()
cursor.execute("SELECT * FROM transactions")

# Map column names onto each row to build one document per record
columns = [col[0] for col in cursor.description]
data = []
for row in cursor:
    record = dict(zip(columns, row))
    data.append(record)

with open("transactions.json", "w") as f:
    # default=str converts Oracle DATE/TIMESTAMP values, which the json
    # module cannot serialize natively
    json.dump(data, f, default=str)
For large systems:
Do NOT load all data in memory
Process in chunks
6.2 Chunk-Based Processing (Very Important)
fetch_size = 10000  # rows per fetch; tune to available memory

with open("transactions.json", "w") as f:
    while True:
        rows = cursor.fetchmany(fetch_size)
        if not rows:
            break
        for row in rows:
            # One JSON document per line: newline-delimited JSON streams
            # straight into mongoimport without --jsonArray
            f.write(json.dumps(dict(zip(columns, row)), default=str) + "\n")
7. Load Data into MongoDB (Core Migration Step)
7.1 Using mongoimport (Free Tool)
mongoimport \
--host rs0/primary:27017 \
--db banking \
--collection transactions \
--file transactions.json \
--numInsertionWorkers 8 \
--batchSize 1000
Key flags:
--numInsertionWorkers → parallel inserts
--batchSize → controls memory per batch
If the file is one large JSON array (as json.dump in 6.1 produces), also pass --jsonArray; newline-delimited files (as in 6.2) import as-is.
8. Parallel Migration (Required for 10TB–100TB)
8.1 Split Data by Range
Example:
transactions_1.json → 0–1M
transactions_2.json → 1M–2M
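Each chunk can be produced with a bounded range query. A sketch reusing the cursor from section 6 (splitting on txn_id is an assumption; any indexed, evenly distributed key works):

cursor.execute(
    "SELECT * FROM transactions WHERE txn_id >= :lo AND txn_id < :hi",
    lo=0,
    hi=1000000,  # chunk 1; shift the bounds for each output file
)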
8.2 Run Parallel Jobs
mongoimport --file transactions_1.json &
mongoimport --file transactions_2.json &
mongoimport --file transactions_3.json &
This reduces migration time from:
Weeks → Days
9. Data Validation (CRITICAL FOR BANKING)
You must prove:
No data loss
No data corruption
9.1 Row Count Validation
Oracle:
SELECT COUNT(*) FROM transactions;
MongoDB:
db.transactions.countDocuments()
9.2 Financial Validation (VERY IMPORTANT)
Check that financial totals match. For example:
Oracle:
SELECT SUM(amount) FROM transactions;
MongoDB:
db.transactions.aggregate([
  { $group: { _id: null, total: { $sum: "$amount" } } }
])
If totals don’t match → STOP migration immediately
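Both checks can be scripted to run automatically after every load. A minimal sketch, assuming cx_Oracle and pymongo with placeholder connection details:

import cx_Oracle
from pymongo import MongoClient

cur = cx_Oracle.connect("user/password@host:port/db").cursor()  # placeholder DSN
coll = MongoClient("mongodb://localhost:27017").banking.transactions

cur.execute("SELECT COUNT(*), SUM(amount) FROM transactions")
ora_count, ora_total = cur.fetchone()

mongo_count = coll.count_documents({})
result = list(coll.aggregate([{"$group": {"_id": None, "total": {"$sum": "$amount"}}}]))
mongo_total = result[0]["total"] if result else 0

# Fail loudly on any mismatch; for real reconciliation compare amounts as
# exact decimals, not floats
assert ora_count == mongo_count, f"Row count mismatch: {ora_count} vs {mongo_count}"
assert float(ora_total) == float(mongo_total), "Totals differ: STOP the migration"
print(f"Validation passed: {ora_count} rows, total {ora_total}")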
10. Incremental Migration (Change Data Capture)
10.1 Strategy
After initial load:
Capture only new changes
Apply to MongoDB
10.2 Simple Incremental Script
SELECT * FROM transactions
WHERE last_updated > :last_migration_time;
Run every:
5 minutes (real-time)
1 hour (batch)
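A sketch of that job as a watermark-driven loop (txn_id and last_updated come from the earlier examples; connection strings are placeholders, and in production the watermark should be persisted, not kept in memory):

import datetime
import time
import cx_Oracle
from pymongo import MongoClient

cur = cx_Oracle.connect("user/password@host:port/db").cursor()  # placeholder DSN
coll = MongoClient("mongodb://localhost:27017").banking.transactions

last_sync = datetime.datetime(2026, 1, 1)  # load the persisted watermark here

while True:
    # Snapshot the clock BEFORE querying so rows updated mid-sync are not skipped
    sync_start = datetime.datetime.now()
    cur.execute("SELECT * FROM transactions WHERE last_updated > :ts", ts=last_sync)
    cols = [c[0].lower() for c in cur.description]
    for row in cur:
        doc = dict(zip(cols, row))
        # Upsert keyed on txn_id so re-running the job is idempotent
        coll.replace_one({"txn_id": doc["txn_id"]}, doc, upsert=True)
    last_sync = sync_start
    time.sleep(300)  # every 5 minutes; use 3600 for hourly batches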
11. Error Handling & Retry Logic
11.1 Common Errors
Duplicate keys
Network failure
Disk full
Timeout
11.2 Retry Script Example
#!/bin/bash
# Retry each file up to 3 times before moving on
for file in *.json
do
  for attempt in 1 2 3
  do
    mongoimport --db banking --collection transactions --file "$file" && break
    echo "Attempt $attempt failed for $file, retrying..."
  done
done
12. Security During Migration (Banking Grade)
12.1 Encrypt Data in Transit
Use TLS:
mongoimport --ssl --sslCAFile ca.pem
12.2 Secure MongoDB Access
Enable authentication:
use admin
db.createUser({
  user: "admin",
  pwd: "securepassword",  // placeholder: pull a strong password from a vault
  roles: ["root"]
})
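Once authentication is on, the migration scripts must also connect with credentials and TLS. A pymongo sketch with placeholder values:

from pymongo import MongoClient

client = MongoClient(
    "mongodb://admin:securepassword@primary:27017/?replicaSet=rs0",  # placeholders
    tls=True,
    tlsCAFile="ca.pem",
    authSource="admin",
)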
13. Minimal Downtime Cutover Strategy
13.1 Final Sync Steps
1. Stop application writes to Oracle
2. Run the final incremental load
3. Validate the data
4. Switch the application to MongoDB
Downtime target:
Less than 30 minutes (best practice)
14. Real Banking Failure Scenario (Very Important)
Scenario:
During migration, 5% of transactions are missing
Root Cause:
Incremental sync failed
Solution:
Re-run the incremental query
Compare IDs to find the missing records
Reload only the missing records (see the sketch below)
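A sketch of that repair, reusing the placeholder connections from earlier and assuming txn_id is the primary key:

import cx_Oracle
from pymongo import MongoClient

cur = cx_Oracle.connect("user/password@host:port/db").cursor()  # placeholder DSN
coll = MongoClient("mongodb://localhost:27017").banking.transactions

# Diff the primary keys on both sides (for very large tables, do this in ranges)
ora_ids = {row[0] for row in cur.execute("SELECT txn_id FROM transactions")}
mongo_ids = {doc["txn_id"] for doc in coll.find({}, {"txn_id": 1, "_id": 0})}
missing = ora_ids - mongo_ids
print(f"{len(missing)} transactions missing from MongoDB")

# Reload only the missing rows; upserts keyed on txn_id keep the fix idempotent
for txn_id in missing:
    cur.execute("SELECT * FROM transactions WHERE txn_id = :id", id=txn_id)
    cols = [c[0].lower() for c in cur.description]
    row = cur.fetchone()
    if row:
        coll.replace_one({"txn_id": txn_id}, dict(zip(cols, row)), upsert=True)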
15. Audit Logging (Mandatory for Banking)
15.1 Log Every Action
Example:
mongoimport --file transactions.json >> migration.log 2>&1