Skip to main content
Version: v2.7 print this page

DWH Management

The Data Warehouse Cluster Management feature in Amorphic is a tool that helps you manage and monitor your data warehouse. It allows you to see important information about your data warehouse, such as how is it performing and what type of data is it storing.

  • You can use it to perform actions on your data warehouse, like creating backups and restarting it.

  • A simplified view of some of the key cluster metrics Ex: CPU Utilization, DB Connection Counts, Disk Space etc.

  • Facilitates cluster operations like creating snapshot, update maintenance and reboot etc.

  • It also provides additional capabilities for redshift type DWH such as:

    - running administrative operations on tables
    - scheduling actions on the cluster
    - enabling or disabling users to list schemas and tables within a tenant

    For example, the feature allows the administrator to prevent users from seeing all the tables or schemas in a database, but they can still access the data by running queries. This is useful in situations where certain information needs to be restricted. By default, this option is enabled for all users in a non-multi tenancy environment and enabled for authorized users in a multi-tenancy environment.

  • It also provides information about the cluster node types and security configurations.

Overall, this feature makes it easy to manage and keep track of your data warehouse.

DWH Management Home Page

Redshift Serverless

Amorphic DWH management extends its support for Redshift Serverless as well.

With Amazon Redshift Serverless,

  • Namespaces and workgroups are used to manage storage and compute resources separately. This is equivalent to cluster identifier in Redshift provisioned server.
  • Node types or node count are not applicable.
  • Pause/ Resume/ Restart are not applicable here as we pay only when queries run.
  • Schedules are not applicable since there is no maintenance window.
  • List of all snapshots being created are displayed in the UI.
  • Recovery points are created every 30 minutes and saved for 24 hours. Latest Recovery point is shown in the UI.
  • WLM Queues are not applicable.
  • List of supported metrics can be found in the below section.
Note
  • Currently, Bulk data load ingestion is not supported

Metrics

The Amorphic DWH management cluster offers the option to access timeline-based cluster metrics, which aids in comprehending and managing the system effectively. You can easily filter these metrics by time range, selecting a duration from 30 minutes to a week, to swiftly access the desired metrics. The default view displays metrics from the last 30 minutes.

Below are the available metrics:

Redshift Provisioned

  • Commit Queue Length
    • Number of transactions waiting to commit at a given point in time.
    • Units: Count
  • Concurrency Scaling Active Clusters
    • Number of concurrency scaling clusters that are actively processing queries at any given time.
    • Units: Count
  • Concurrency Scaling Seconds
    • Number of seconds used by concurrency scaling clusters that have active query processing activity.
    • Units: Count
  • CPU Utilization
    • Shows the percentage of CPU utilization for all nodes.
    • Units: Percent
  • Database Connection Count
    • The number of database connections to a cluster.
    • Units: Count
  • Health Status
    • Indicates the health of the cluster.
    • Units: Count( 1-HEALTHY / 0-UNHEALTHY ).
  • Maintenance Mode
    • Indicates whether the cluster is in maintenance mode.
    • Units: Count( 1-ON / 0-OFF ).
  • Max Configured Concurrency Scaling Clusters
    • Maximum number of concurrency scaling clusters configured from the parameter group.
    • Units: Count
  • Network Receive Throughput
    • The rate at which the node or cluster receives data.
    • Units: Bytes/Second
  • Network Transmit Throughput
    • The rate at which the node or cluster writes data.
    • Units: Bytes/Second
  • Percentage Disk Space Used
    • The percent of disk space used.
    • Units: Percent
  • Read IOPS
    • The average number of disk read operations per second.
    • Units: Count/Second
  • Read Latency
    • The average amount of time taken for disk read I/O operations.
    • Units: Seconds
  • Read Throughput
    • The average number of bytes read from disk per second.
    • Units: Bytes/Second
  • Redshift Managed Storage Total Capacity
    • Total managed storage capacity.
    • Units: Megabytes
  • Total Table Count
    • The number of user tables (excluding Amazon Redshift Spectrum tables) open at a particular point in time.
    • Units: Count
  • Write IOPS
    • The average number of write operations per second.
    • Units: Count/Second
  • Write Latency
    • The average amount of time taken for disk write I/O operations.
    • Units: Seconds
  • Write Throughput
    • The average number of bytes written to disk per second.
    • Units: Bytes/Second
  • Num Exceeded Schema Quotas
    • The number of schemas with exceeded quotas.
    • Units: Count

Redshift Serverless

  • Database Connection Count
    • The number of database connections at a point in time.
    • Units: Count
  • Compute Capacity
    • Average number of compute units allocated during the past 30 minutes.
    • Units: Count
  • Compute Seconds
    • Accumulated compute-unit seconds used in the last 30 minutes.
    • Units: Count
  • Total Table Count
    • Number of user tables (excluding Amazon Redshift Spectrum tables) existing at a point in time.
    • Units: Count
  • Data Storage
    • Number of megabytes used, in disk or storage space, for Redshift data.
    • Units: Megabytes

Operations

Amorphic DWH management cluster console provides a handful of key operations such as:

The following picture depicts the DWH Management Console Actions in Amorphic:

DWH Management cluster actions page

Creating Snapshot

The Management console enables you to generate manual snapshots with a customizable retention period for the cluster. The valid retention period ranges from 1 to 3653 days.

DWH Management Actions Page

Update cluster maintenance window

Enables you to update the scheduled cluster maintenance window. Supported format : ddd:HH:MM-ddd:HH:MM

DWH Management Actions Page

Rebooting Cluster

Enables you to reboot an active cluster.

DWH Management Actions Page

Pause and Resume Cluster

The Amorphic management console permits you to pause and resume their Redshift cluster. Please note that this option is exclusively supported for Redshift type DW clusters. To pause or resume the cluster, simply click on the pause/resume button, as demonstrated in the image below.

DWH Management Actions Page

Pause/Resume Schedule

To create a pause/resume schedule click on the schedules tab and then click on Create Schedule button, in the dwh management section. The supported actions are:

  • Pause Later: Pause the cluster at a given time stamp (at format) or recurring (cron format) schedule
  • Resume Later: Resume the cluster at a given time stamp (at format) or recurring (cron format) schedule
  • Pause and Resume Later: Schedule a pause and resume action at a defined period. Example : Pause on every friday 10:00 PM and resume on monday morning 9:00 AM

DWH Management Actions Page

DWH Management Actions Page

Note
  • Format of at expression is "at(yyyy-mm-ddThh:mm:ss)". For example, "at(2016-03-04T17:27:00)".
  • Format of cron expression is "cron(Minutes Hours Day-of-month Month Day-of-week Year)". For example, "cron(0 10 ? MON )".

Vacuuming

Amorphic enables you to execute a process called "vacuuming" on database tables, which aids in freeing up space and enhancing performance. This process is exclusively available for backend clusters of the Redshift type. Amorphic supports various types of vacuuming, such as:

  • FULL: This is the default option, it reclaims disk space occupied by rows that were marked for deletion by previous UPDATE and DELETE operations.

  • SORT ONLY: This option sorts the specified table without reclaiming space freed by deleted rows.

  • DELETE ONLY: This option reclaims disk space occupied by rows that were marked for deletion by previous UPDATE and DELETE operations, and compacts the table to free up the consumed space. It doesn't sort table data.

  • REINDEX: This option analyzes the distribution of the values in interleaved sort key columns, then performs a full VACUUM operation. It is not supported with TO threshold PERCENT.

Note

Threshold Percent is a clause that specifies the threshold value, above which VACUUM skips the sort phase and the target threshold for reclaiming space in the delete phase.

DWH Management Home Page

Administrator Queries

Enables you to run administrator level queries on the database cluster. Amorphic supports the following actions

  • Query System Tables : Fetch table load, Work load management, S3 client errors and list table locks. Amorphic provides advanced querying options to enable conditional filtering of the results. For example, a user can fetch work load errors based on the error code or message.

DWH Management Home Page

Query System tables

Amorphic allows querying all the system tables that fall under STL, SVL and STV category along with AWSDMS_APPLY_EXCEPTIONS, SYS_CONNECTION_LOG, SYS_LOAD_ERROR_DETAIL and SYS_QUERY_HISTORY.

Below are the few examples to show how to query system-tables:

  • STL_LOAD_ERRORS : Contains a history of all Redshift load errors. Load Errors

  • STL_S3CLIENT_ERROR : Records errors encountered by a slice while loading a file from Amazon S3. S3 client errors

  • STL_WLM_ERROR : Records all Work load management related errors as they occur. Work load management errors

  • STL_TR_CONFLICT : Records all transaction conflicts with database tables. Transaction conflicts info

  • STV_SESSIONS : Lists all current active cluster connections. Active sessions

  • STV_RECENTS : Lists all current running queries. Active queries

  • STV_LOCKS : Lists all existing table locks. Table locks

  • AWSDMS_APPLY_EXCEPTIONS : Provides information about DMS exceptions AWSDMS exception

  • STL_LOAD_ERROR_DETAIL : Contains a history of all Redshift load errors. Load Errors Detail

  • SYS_CONNECTION_LOG : Lists all current active cluster connections. Connection Log

  • SYS_QUERY_HISTORY : Lists all current running queries. Active queries

    Note

    For Redshift Serverless only SYS_CONNECTION_LOG, SYS_QUERY_HISTORY and SYS_LOAD_ERROR_DETAIL system tables are available.

Kill Sessions/Queries

Amorphic allows user to kill the sessions and queries.

Kill Session or Query

Queries/Loads

This section offers details about queries and loads running queries within the database. You can utilize this information to identify and troubleshoot queries that are experiencing prolonged processing times and causing bottlenecks, thereby hindering the efficient processing of other queries.

Queries and Loads

User can filter the results based on the following filters:

- All: Returns all the queries that ran in the specified duration
- Completed: Returns the successful queries
- Aborted: Returns the queries that were aborted
- Short: Returns queries which took less than 10 seconds to execute
- Medium: Returns queries which took more than 10 sec but less than 10 minutes to execute
- Long: Return queries which took more than 10 minutes to execute
- Loads: Returns the load queries
- Top 100: Returns the top 100 queries by duration

DWH Metadata Access

Amorphic allows users to Enable/disable other, DWH users to list schemas or tables inside an authorized tenant.

Users can list and select the authorized tenant to which they belong and update their access to the user catalog. Multi tenancy Update metadata access

WLM Queues

Amorphic management console allows you to create, edit and delete Redshift Workload Management queues so that user queries are appropriately routed and picked based on the priority of the user's work. WLM Queues

When you turn on concurrency scaling, Amazon Redshift will automatically add more power to your cluster to handle an increase in read and write queries. You can set the priority of your queries in a queue by giving them a priority value. This priority is set for the whole queue and applies to all the queries in that queue.

As part of managing your workload, you can also set up rules to monitor your queries. These rules use metrics to define how your queries should perform and what should happen if they don't meet those performance standards.

Each rule can have up to three conditions, or things to check for, and one action to take if the conditions are met. These conditions check for a metric, like how long a query takes, and compare it to a value using =, <, or >. The actions that can be taken if the conditions are met are to log the query, to stop it, or to change its priority.

Here are the different metrics you can use in your rules and their minimum and maximum values

Metric NameMin ValueMax Value
Query Execution Time086399
Query Queue Time086399
Query CPU Time0999999
Blocks Read01048575
Scan Row Counts0999999999999999
Memory to Disk0319815679
CPU Skew099
I/O Skew099
Rows Joined0999999999999999
Nested Loop Join in Row Count0999999999999999
Note
  • A maximum of 15 queues can be created.
  • An overall of 25 rules can be applied on all queues
  • There can be maximum 3 predicates per rule
  • WLM Queues not applicable for Redshift Serverless

Access Parity Repair/Report

Amorphic admins can generate Access Parity Report, any discrepancies present in DynamoDB or DWH access will be present in the report. Also, admins will receive the Access Parity Report through Email. Admins can also trigger the repair process which will fix the discrepancies. A separate email consisting of Access Parity repair report will be sent to the admins.

A preview of the latest Access Parity Report will be available in the page shown below. Preview Access Parity Report

Access Parity Repair/Report processes can be triggered as shown below. Generate Access Parity Report