When a data profiling job (a backend job) fails with an unhandled exception, it retries an inordinate number of times, which incurs additional cost for the customer.
This issue occurs when Amorphic is deployed with single tenancy and has datasets with data profiling enabled.
Affected Versions: 1.11, 1.12, 1.13, 1.14, 2.0, 2.1
Fix Version: 2.2
Root cause(s)
- Because of an incorrect retry configuration for failed jobs, the data profiling job retries an inordinate number of times.
- Unhandled exceptions in the data profiling job (a scheduled backend job).
- When the Redshift cluster is paused, the data profiling job fails with a timeout exception.
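The first two root causes can be illustrated with a minimal sketch of a bounded retry wrapper. This is not the code shipped in the fix; the names `run_with_bounded_retries`, `NonRetryableError`, `max_attempts`, and `base_delay_seconds` are hypothetical and chosen only for illustration. The sketch caps the number of attempts and handles exceptions explicitly, so a persistent failure ends the run instead of triggering further executions.

```python
import time


class NonRetryableError(Exception):
    """Hypothetical marker for failures that a retry cannot fix."""


def run_with_bounded_retries(job, max_attempts=3, base_delay_seconds=0.0,
                             sleep=time.sleep):
    """Call job() up to max_attempts times, then give up.

    All names and defaults here are illustrative assumptions,
    not the actual Amorphic retry configuration.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except NonRetryableError:
            # Retrying cannot help, so surface the failure immediately.
            raise
        except Exception as error:
            # Handle the failure explicitly instead of leaving it unhandled.
            last_error = error
            if attempt < max_attempts:
                # Exponential backoff between attempts.
                sleep(base_delay_seconds * 2 ** (attempt - 1))

    # The retry budget is spent: fail once, with the last error attached.
    raise RuntimeError(f"job failed after {max_attempts} attempts") from last_error
```

With a cap like this, a timeout caused by a paused cluster costs at most `max_attempts` executions per scheduled run.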
Impact
The account accrues additional cost from unnecessary job executions.
Mitigation
Workaround
Make sure the Redshift cluster is in an active state around the scheduled run of the data profiling job (every day at 00:00 UTC), or disable the data profiling flag on the affected datasets.
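One way to verify the cluster state ahead of the scheduled run is sketched below. It assumes a boto3 Redshift client and its `describe_clusters` call; the helper name `cluster_is_available` and the cluster identifier in the usage comment are hypothetical. The client is passed in as a parameter so the check can be exercised without AWS access.

```python
def cluster_is_available(redshift_client, cluster_identifier):
    """Return True if the Redshift cluster reports the 'available' status.

    redshift_client is expected to behave like boto3.client("redshift").
    The helper name and the identifier passed in are illustrative assumptions.
    """
    response = redshift_client.describe_clusters(
        ClusterIdentifier=cluster_identifier
    )
    clusters = response.get("Clusters", [])
    # A paused cluster reports a status such as "paused", not "available".
    return bool(clusters) and clusters[0].get("ClusterStatus") == "available"


# Usage sketch (requires boto3 and AWS credentials; identifier is hypothetical):
#   import boto3
#   client = boto3.client("redshift")
#   if not cluster_is_available(client, "my-cluster-id"):
#       print("Cluster is not available; resume it before 00:00 UTC.")
```

A check like this could run shortly before 00:00 UTC to flag a paused cluster while there is still time to resume it.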
Timeline
- 2023-04-06: Bug reported/identified (CLOUD-3209)
- 2023-04-06: Bug triaged