Search Datasets
Search Datasets and File Search can be disabled to save costs. Contact an administrator to have them disabled.
Amorphic lets you run search queries against datasets and their metadata files using the Search Datasets feature in the Catalog section.
Available Keys for Search:
'DatasetId', 'DatasetName', 'Domain', 'FileType', 'TargetLocation', 'DatasetDescription', 'IsActive', 'LastModifiedBy', 'LastModified', 'Keywords',
'DatasetType', 'SerDe', 'ConnectionType', 'CreatedBy', 'TenantName', 'DatasetSchema', 'DataClassification'
You can type "*" in the search bar to list all datasets in the account, both accessible and non-accessible.
Sample Queries:
* - retrieves all datasets
(DatasetName:dataset1) - returns the dataset named 'dataset1'
(DatasetName:dataset*) - returns datasets whose names start with 'dataset', e.g. 'dataset1', 'dataset2'
(DatasetName:dataset*) AND (TargetLocation:S3) - returns datasets whose names start with 'dataset' and whose target location is S3
(Domain:domain1) AND (TargetLocation: S3 OR redshift) - returns datasets in domain1 whose target location is S3 or Redshift
Note: You can combine AND and OR operators to make search results more specific.
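As a minimal sketch of how such queries can be composed programmatically, the Python snippet below assembles query strings from the keys listed above. The helper names `clause` and `all_of` are hypothetical and not part of Amorphic; the snippet only builds the text you would paste into the search bar.

```python
# Keys copied from the "Available Keys for Search" list above.
SEARCH_KEYS = {
    "DatasetId", "DatasetName", "Domain", "FileType", "TargetLocation",
    "DatasetDescription", "IsActive", "LastModifiedBy", "LastModified",
    "Keywords", "DatasetType", "SerDe", "ConnectionType", "CreatedBy",
    "TenantName", "DatasetSchema", "DataClassification",
}


def clause(key, *values):
    """Build one parenthesized clause; multiple values are OR-ed together.

    Hypothetical helper for illustration only, not an Amorphic API.
    """
    if key not in SEARCH_KEYS:
        raise ValueError(f"unknown search key: {key}")
    if not values:
        raise ValueError("at least one value is required")
    return f"({key}:{' OR '.join(values)})"


def all_of(*clauses):
    """Join clauses with AND so that every clause must match."""
    return " AND ".join(clauses)


query = all_of(
    clause("Domain", "domain1"),
    clause("TargetLocation", "S3", "redshift"),
)
print(query)
```

The printed query has the same shape as the last sample query above. Rejecting unknown keys early is a design choice of this sketch, so that a typo in a key name fails loudly instead of silently returning no results.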
You can also sort and filter the results based on different attributes. Some departments or user groups may have access to certain datasets. You can check which datasets you have access to, as well as request access to a dataset you do not have access to.
Customized Dataset Search
Dataset search has been enhanced so that you can restrict queries to the datasets you have access to (as owner or read-only). To use it, select the check box below.
For a multi-tenancy deployment, dataset search works a little differently. Since each tenant's data is isolated and remains invisible to other tenants, Amorphic dataset search only returns the datasets which are part of user-accessible tenants.
For example, if user "UserA" has access to the "testorg1" and "testorg2" tenants and is searching for datasets, the search results would only display the datasets which are created under domains which are part of the "testorg1" or "testorg2" tenants.
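The tenant scoping described above can be pictured with a small Python sketch. This is a conceptual illustration only, not Amorphic's implementation: the `visible_datasets` function, the sample records, and the "testorg3" tenant are all hypothetical, and the record fields simply reuse the 'DatasetName', 'Domain', and 'TenantName' search keys.

```python
def visible_datasets(datasets, user_tenants):
    """Keep only datasets whose tenant is one the user can access.

    Hypothetical illustration of tenant scoping, not an Amorphic API.
    """
    allowed = set(user_tenants)
    return [d for d in datasets if d["TenantName"] in allowed]


# Made-up sample records; "testorg3" is a tenant UserA cannot access.
datasets = [
    {"DatasetName": "sales", "Domain": "finance", "TenantName": "testorg1"},
    {"DatasetName": "clicks", "Domain": "web", "TenantName": "testorg2"},
    {"DatasetName": "payroll", "Domain": "hr", "TenantName": "testorg3"},
]

user_a_results = visible_datasets(datasets, ["testorg1", "testorg2"])
print([d["DatasetName"] for d in user_a_results])
```

In this sketch, the "payroll" dataset is dropped from UserA's results because its tenant is not one UserA can access.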
Repair Search Metadata (API Only)
This option lets users repair the search metadata stored in the OpenSearch (OS) cluster. By default, metadata of uploaded files and AI files for S3 datasets is not indexed. To index these files, call the API with the query string parameter 'index_files' set to true.
Resource Path: /opensearchindex/operations?index_files=false
HTTP Method: PUT
Request Payload:
{
"IndexAction": <string> (recreate, delete, create)
}
Users can perform the following three actions:
- recreate: Deletes and recreates index
- delete: Deletes index
- create: Creates index
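A minimal Python sketch of building this request with the standard library follows. The resource path, HTTP method, payload key, and action values come from this section; the base URL, the `Authorization` header format, and the `build_repair_request` helper name are assumptions for illustration and should be replaced with the values for your deployment.

```python
import json
import urllib.request

# Action values taken from the list above.
VALID_ACTIONS = ("recreate", "delete", "create")


def build_repair_request(base_url, auth_token, action, index_files=False):
    """Build (but do not send) the PUT request for the repair operation.

    base_url and auth_token are placeholders; the Authorization header
    format is an assumption, not confirmed by this document.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"IndexAction must be one of {VALID_ACTIONS}")
    flag = "true" if index_files else "false"
    url = f"{base_url.rstrip('/')}/opensearchindex/operations?index_files={flag}"
    body = json.dumps({"IndexAction": action}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": auth_token,
            "Content-Type": "application/json",
        },
    )


# Example with a placeholder host and token; nothing is sent here.
req = build_repair_request(
    "https://api.example.com", "<token>", "recreate", index_files=True
)
print(req.get_method(), req.full_url)
```

To actually send the request, pass `req` to `urllib.request.urlopen`. Building and sending are kept separate here so the URL and payload can be inspected before an index is deleted or recreated.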