Data Migration – Oracle to Redshift using AWS S3 Stage

Introduction

Are you trying to migrate data from Oracle to Redshift? Have you looked all over the internet for a solution? If so, you are in the right place. Redshift is a fully managed cloud data warehouse, whereas Oracle is a relational database management system.

This article gives you a brief overview of migrating data from an Oracle database to Redshift via an S3 stage.

What is the S3 stage, and why do we need it here?

S3 Bucket – The staging area for Redshift. This might be something created solely for use by Redshift, but it is often already an integral part of a company’s greater data repository landscape.
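Before wiring a bucket into uArrow, it can help to confirm it is reachable with your AWS credentials. The snippet below is a minimal, stand-alone sketch using boto3; the bucket name and region are placeholders, not values from uArrow.

```python
# Minimal sketch (independent of uArrow): confirm the staging bucket exists
# and is reachable with the current AWS credentials. Bucket name and region
# are placeholders -- substitute your own.
import boto3

s3 = boto3.client("s3", region_name="us-east-1")
s3.head_bucket(Bucket="my-redshift-stage-bucket")  # raises ClientError if missing or inaccessible
print("Staging bucket is reachable")
```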

Prerequisite

We need the following active accounts for the Oracle to Redshift migration:

  • Oracle Database
  • Redshift
  • AWS S3 (Optionally you can use uArrow S3 )

To know more about Oracle Database, visit this link.

To know more about Redshift, visit this link.

To know more about AWS S3, visit this link.

Create Connections
1. Oracle connection
uArrow has an in-built Oracle integration that connects to your Oracle database within a few seconds.

1.1. Click the Connection menu at the top to view the available adapters (SQL DATABASE & CLOUD WAREHOUSE, CLOUD STORAGE, etc.)

1.2. Click the Oracle button to create an Oracle database connection

1.3. Provide the connection parameters below in the connection creation form

| Parameter Name | Description |
| --------------- | ----------- |
| Connection name | Specify the name of the source connection. |
| Host | Enter the name of the machine where the Oracle Server instance is located. It can be a computer name, a fully qualified domain name, or an IP address. |
| Port | Enter the port number used to connect to this Oracle Server. Four-digit integer, default: 1521. |
| Database | Enter an existing Oracle database through which uArrow accesses the source data to migrate. |
| Schema | Enter an existing Oracle database schema name. |
| User | Enter the user name to use for authentication on the Oracle database. |
| Password | Enter the password to use for authentication on the Oracle database. |

1.4. After entering the connection details, validate the connection to verify it
![Docusaurus](/img/uarrow/conn/dm_conn_oracle_create_2.png)
1.5. Save Connection – don't forget to save the connection after the validation succeeds.
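If you want to double-check the same Oracle parameters outside uArrow, a minimal sketch with the python-oracledb driver is shown below. All values (host, port, service name, user, password) are placeholders.

```python
# Hypothetical stand-alone check of the Oracle connection parameters using
# the python-oracledb driver. Every value below is a placeholder.
import oracledb

conn = oracledb.connect(
    user="uarrow_user",
    password="secret",
    dsn="oracle-host.example.com:1521/ORCLPDB1",  # host:port/service_name
)
cur = conn.cursor()
cur.execute("SELECT 1 FROM dual")
print(cur.fetchone())  # (1,) on success
cur.close()
conn.close()
```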

2. AWS S3 connection

uArrow has an in-built AWS S3 integration that connects to your S3 bucket within a few seconds.

2.1. Click the Connection menu at the top to view the available adapters (SQL DATABASE & CLOUD WAREHOUSE, CLOUD STORAGE, etc.)

2.2. Click the AWS S3 button to create an AWS S3 connection

2.3. Provide the connection parameters below in the connection creation form

| Parameter Name | Description |
| --------------- | ----------- |
| Connection name | Specify the name of the stage connection. |
| Bucket Name | Specify your AWS S3 bucket name. |
| Access Key | Specify the access key for your Amazon Web Services account. |
| Secret Key | Specify the secret key for your Amazon Web Services account. |
| Region | The region where your bucket is located. For example: us-east-1. |
| S3 URI | Specify your AWS S3 bucket URI. |

2.4. After entering the connection details, validate the connection to verify it
![Docusaurus](/img/uarrow/conn/dm_conn_oracle_create_2.png)

2.5. Save Connection – don't forget to save the connection after the validation succeeds.
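A rough equivalent of this validation step, done outside uArrow with boto3, is sketched below. The bucket name, keys, and region are placeholders.

```python
# Hedged sketch: verify the same S3 parameters (bucket, access key, secret
# key, region) by listing a few objects. All values are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    aws_access_key_id="AKIAEXAMPLE",
    aws_secret_access_key="exampleSecretKey",
    region_name="us-east-1",
)
resp = s3.list_objects_v2(Bucket="my-redshift-stage-bucket", MaxKeys=5)
for obj in resp.get("Contents", []):
    print(obj["Key"])
```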

3. Redshift connection
uArrow has an in-built Redshift integration that connects to your Redshift warehouse within a few seconds.

3.1. Click the Connection menu at the top to view the available adapters (SQL DATABASE & CLOUD WAREHOUSE, CLOUD STORAGE, etc.)

3.2. Click the Redshift button to create a Redshift warehouse connection

3.3. Provide the connection parameters below in the connection creation form

| Parameter Name | Description |
| --------------- | ----------- |
| Connection name | Specify the name of the target connection. |
| Host | Specifies the hostname / IP address of the Redshift warehouse. |
| Port | Specifies the port number of the Redshift warehouse; default port 5439. |
| User | Redshift warehouse login name of the user for the connection. |
| Password | Specifies the password for the specified user. |
| db | The specified database should be an existing database for which the specified default role has privileges. |
| schema | The specified schema should be an existing schema for which the specified default role has privileges. |
| warehouse | The specified warehouse should be an existing warehouse for which the specified default role has privileges. |
| role | The specified role should be an existing role that has already been assigned to the specified user for the driver. If the specified role has not already been assigned to the user, the role is not used when the session is initiated by the driver. |

3.4. After entering the connection details, validate the connection to verify it
![Docusaurus](/img/uarrow/conn/dm_conn_oracle_create_2.png)
3.5. Save Connection – don't forget to save the connection after the validation succeeds.
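To sanity-check the same Redshift parameters outside uArrow, here is a minimal sketch using the redshift_connector package (psycopg2 works equally well). Every value is a placeholder.

```python
# Hedged sketch: check the Redshift connection parameters with
# redshift_connector. All values below are placeholders.
import redshift_connector

conn = redshift_connector.connect(
    host="my-cluster.abc123xyz.us-east-1.redshift.amazonaws.com",
    port=5439,
    database="dev",
    user="awsuser",
    password="secret",
)
cur = conn.cursor()
cur.execute("SELECT current_database(), current_user")
print(cur.fetchone())
cur.close()
conn.close()
```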

Create Job
After creating the connections, you are ready to create a data migration job.

Click the Data Migration menu at the top to create a data migration job.

You will see the screen below, where you can click the create link to create a new job.

Note: if you have already created a job, you can use the + button to create a new data migration job.

The Job Creation screen has three phases:

  • Source – specify the source database details
  • Mapping – verify the automated mapping
  • Target – specify the target warehouse details

1. Source
Define source details

Specify the details below to define the source:
| Parameter Name | Description |
| ------------------------------- | ------------------------------------------------------------ |
| Name | Specify a name for the new data migration job. |
| Description | Specify a detailed description if required. |
| Mapping Creation | Mapping can be created in two ways:<br/>1. Using System – use the uArrow screen to select / de-select the mapping details, or<br/>2. Using Existing File – if you have prepared the mappings in a CSV file, you can import it here.<br/>In this article we are using Option 1, Using System. |
| Source Connection | Select the source database connection you created earlier; if no source connection exists, create one first. |
| Source Stage Connection | This is the stage connection; it can be AWS S3, Azure BLOB Storage, or Google Cloud Storage, created in the steps above.<br/>In this article we are using AWS S3 as the Source Stage.<br/>Note: choose the **Source Stage** based on the location of your target connection. Example: if the Redshift warehouse sits in AWS, use AWS S3 as the **Source Stage**; if your target warehouse sits on the Azure platform, Azure BLOB Storage is recommended. |
| Choose Source Stage File's Path | Select your existing AWS S3 bucket folder or create a new folder path. |

After specifying the source details, **Save** the connection and click **Next** to open the mapping definition screen.
![Docusaurus](/img/uarrow/job/dm_create_job_3.png)
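For orientation only, the sketch below shows roughly what "staging" amounts to conceptually: extract a source table to CSV and place the file under the chosen S3 folder. uArrow manages the stage itself; the table, bucket, and path here are placeholders.

```python
# Conceptual illustration only -- not uArrow internals. Extract an Oracle
# table to CSV in memory and upload it to the chosen S3 stage path.
import csv
import io

import boto3
import oracledb

src = oracledb.connect(user="uarrow_user", password="secret",
                       dsn="oracle-host.example.com:1521/ORCLPDB1")
cur = src.cursor()
cur.execute("SELECT * FROM hr.employees")

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow([col[0] for col in cur.description])  # column names as the header row
writer.writerows(cur)                                 # data rows

boto3.client("s3").put_object(
    Bucket="my-redshift-stage-bucket",
    Key="oracle_to_redshift/employees/employees.csv",
    Body=buf.getvalue().encode("utf-8"),
)
cur.close()
src.close()
```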

2. Mapping
uArrow generates automated mapping based on source schema metadata.

Below are the features available to enhance your mapping for data migration:

| Feature Name | Description |
| ------------- | ----------- |
| Required | To exclude related tables / data sets, de-select the Required flag. |
| Create Table | Creates a new table in the target warehouse; if de-selected, the existing target table is re-used for the migration. |
| Clean Load | Enabled: delete/truncate the target table data, then load the source table data. This flag is disabled if the Create Table option is enabled.<br/>Disabled: no delete/truncate is performed; the source table data is simply inserted into the target table. |
| Capture Bad Rows | Captures bad rows (failed rows) in a separate bad-rows table for later analysis. |
| Follow Target Schema | When enabled, the source table data is loaded according to the target table's data types / schema. This option is disabled if the Create Table option is enabled. |
| Source Filter | Filters data from the source table / data set using ANSI SQL. |

After verifying the mapping definitions, click the Apply button to save the mapping changes, then click the Next button to define the target.

3. Target
Define target details

| Parameter Name | Description |
| --------------- | ----------- |
| Target Stage Connection | Select an existing target stage connection. It needs create, truncate/delete, insert, and select table permissions; this connection mainly captures bad rows and acts as a stage area for the target connection. |
| No. Target Load | Default value: 1. Increase it if you want to load multiple targets from the same source. |
| Target Connection 1 | Select an existing target connection to migrate the source data into. It needs at least insert, delete, and select permissions; if the Create Table feature is enabled, create permission is also required. |

After defining the target connection details, click the Save button to save the data migration job.
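As background, the stage-to-target load in Redshift is typically performed with the COPY command reading files from the S3 stage path. The sketch below only illustrates that mechanism; uArrow issues its own statements, and the table name, S3 path, and IAM role are placeholders.

```python
# Hedged sketch of the mechanism behind a stage-to-target load: Redshift's
# COPY command reading CSV files from the S3 stage path. Placeholders throughout.
import redshift_connector

conn = redshift_connector.connect(
    host="my-cluster.abc123xyz.us-east-1.redshift.amazonaws.com",
    port=5439, database="dev", user="awsuser", password="secret",
)
cur = conn.cursor()
cur.execute("""
    COPY public.employees
    FROM 's3://my-redshift-stage-bucket/oracle_to_redshift/employees/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
    FORMAT AS CSV
    IGNOREHEADER 1
""")
conn.commit()
cur.close()
conn.close()
```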

4. Schedule / Ad-hoc Run
After saving the data migration job, you can run it immediately (ad-hoc run) or schedule it if required.

Ad-hoc Run: use the Run icon/button to trigger an ad-hoc run; after the run starts, you are taken to the job log dashboard screen.

Schedule: use the Schedule icon/button to schedule the existing job.

Monitoring Job

Congratulations! You have created a new Oracle to Redshift job.

After successful job execution you can see the job progress and lineage info.

Note: You can view job summary stats, table-level details, and failed-row details for a completed job.

1. Navigate to Job Log
Click the Job Log menu at the top to check the job logs.

2. Summary Stats
Click an existing Job Status link/button in the above screen to view the job log dashboard; this takes you to the Data Migration Job Log Summary screen.

3. Detail Stats

Data Migration Table Level Status: here you can see the failed-row details by clicking on the Failed Rows count, if any.
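If you need more detail than the uArrow screen provides, Redshift itself records load failures in its STL_LOAD_ERRORS system table. A minimal sketch is shown below; the connection values are placeholders, and the query is standard Redshift rather than anything uArrow-specific.

```python
# Hedged sketch: inspect recent Redshift load errors directly via
# stl_load_errors. Connection values are placeholders.
import redshift_connector

conn = redshift_connector.connect(
    host="my-cluster.abc123xyz.us-east-1.redshift.amazonaws.com",
    port=5439, database="dev", user="awsuser", password="secret",
)
cur = conn.cursor()
cur.execute("""
    SELECT starttime, filename, line_number, colname, err_reason
    FROM stl_load_errors
    ORDER BY starttime DESC
    LIMIT 20
""")
for row in cur.fetchall():
    print(row)
cur.close()
conn.close()
```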

4. Lineage

Questions? Feedback?

Did this article help? If you have questions or feedback, feel free to contact us.
