SQL AI - Your SQL Assistant
SQL AI is the latest feature in the Amorphic Cloud Platform, designed to simplify SQL querying by leveraging advanced Large Language Models (LLMs). This tool empowers users to convert natural language inputs into SQL queries, making it more intuitive to interact with datasets and views hosted on the platform. Whether you need to retrieve specific data, ask analytical questions, or get SQL code generated from casual language, SQL AI makes querying easier and more accessible. This generative AI add-on supports querying in natural language and provides results using models of your choice making it easy to ask questions on data or retrieve ready to use SQL queries to quickly fetch required data from the selected data source. Queries will mostly be used for one of the following categories, them being for natural language suggestions, for generating queries from natural language prompts or for explaining any existing query on top of any available dataset/view.
Key Features
Natural Language to SQL Translation
SQL AI uses powerful LLMs to interpret customer inquiries expressed in natural language, transforming them into executable SQL queries. This significantly reduces the technical expertise needed to query complex datasets and allows users to ask questions in plain language while receiving the corresponding SQL code or results. The system supports domain-specific queries, ensuring that datasets and views in different domains can be easily distinguished, even when they share the same name. Currently, for use purposes only S3-Athena and Redshift datasets and views are supported within SQL-AI.
- It is important to note that when querying on datasets and views which have the same name but are in different domains, use
[domain-name].[dataset-name/view-name]
(Querying is case-sensitive. Resources should be matching the exact naming case as when created.) for accurate results on the correct dataset or view. Alternatively, you can select the dataset or view from the dropdown list available in the window. - It is recommended to have clear column name(s) and column description(s) in the datasets which would provide better context to the model yielding better results.
- The generated queries will be based on the schema of the data source and not the actual data.
Interactive Query Categories
SQL AI supports three main categories of interaction:
- Natural Language Suggestions: Users can input natural language queries to receive suggestions or insights from their data.
- Query Generation: SQL AI generates SQL queries from natural language prompts, allowing users to easily obtain SQL code for their data requests.
- Query Explanation: The tool can also explain existing SQL queries, breaking down each component and clarifying what the query does on a given dataset or view.
Model Customization
SQL AI offers the flexibility to choose between multiple models for query generation and explanation. These models help cater to different language processing and generative AI needs. Currently supported models include:
- Claude 3 Haiku
- Claude 3 Sonnet
Resource Selection
Currently, user can prompt on the resources in one of the following ways:
- Select the resource from the available dropdown resource list.
- Enter just the dataset/view name in the prompt.
- Enter the domain name followed by dataset/view name in the prompt.
Atleast one resource needs to be selected using one of the above ways for query suggestion or generation related prompts to work. We recommend only selecting the resource for which the question is relevant for better results
Selecting more than 20 resources is not recommended as the model responses might be hallucinated due to a wider context
Use Cases
Natural Language Suggestions
Example: "Suggest some queries on the dataset."
SQL AI will generate the natural language questions to generate queries needed to retrieve this data from the appropriate dataset, removing the need for manual SQL coding.
SQL Generation
Example: "Generate a query to retrieve top 10 Sales person sorted in ascending order."
SQL AI converts this natural language prompt into a fully-formed SQL query, allowing you to execute it directly or modify it further.
Once the query is generated, it can be reviewed and then executed from the query engine using the Run Query
button available in the right corner of the query tab
SQL Explanation
Example: You provide an SQL query:
SELECT * FROM
test_domain.test_dataset
WHERE DATE_FORMAT(ModifiedDate, '%Y') = '2006';
Limitations
- The scope of the chatbot is only limited to the actions mentioned above. It cannot be used for asking questions outside this context.
- Query might not always be correct. Precise prompts may lead to much more precise queries and results.
- Follow up prompting on any obtained results are also not optimal.
- Currently identifying and rectifying errors in the generated SQL queries are not supported.
- To execute queries in Redshift on resources that are not in the default tenant, you need to manually add the tenant name. The generated queries will only reference resources using the
<domain>.<table_name>
format, so specifying the tenant is necessary to follow the<tenant_name>.<domain_name>.<table_name>
syntax.