pc.create_connection_snowflake
pc.create_connection_snowflake(name, user, password, account, warehouse, database, schema)
Create a connection to a Snowflake cloud data warehouse.
Parameters:
name: str
Name of connection.
user: str
User to authenticate as.
password: str
Password used to authenticate.
account: str
Account ID.
warehouse: str
Warehouse used to run queries.
database: str
Database name to pull tables from.
schema: str
Schema to query against.
Returns:
Examples:
Connect to a Snowflake Data Warehouse.
connection = pc.create_connection_snowflake('Snowflake Connection', '{Username}', '{Password}', '{Account Identifier}', '{Warehouse Name}', '{Database Name}', '{Schema Name}')
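To keep credentials out of source code, the seven arguments can be collected from environment variables before calling `pc.create_connection_snowflake`. The following is a minimal sketch, not part of the Predibase SDK: the helper `snowflake_connection_args` and the `SNOWFLAKE_*` variable names are hypothetical and chosen only for illustration.

```python
import os

# Hypothetical environment variable names (an assumption, not a Predibase
# convention). The dict order matches the order of the parameters after `name`:
# user, password, account, warehouse, database, schema.
ENV_VARS = {
    "user": "SNOWFLAKE_USER",
    "password": "SNOWFLAKE_PASSWORD",
    "account": "SNOWFLAKE_ACCOUNT",
    "warehouse": "SNOWFLAKE_WAREHOUSE",
    "database": "SNOWFLAKE_DATABASE",
    "schema": "SNOWFLAKE_SCHEMA",
}


def snowflake_connection_args(name, env=None):
    """Return (name, user, password, account, warehouse, database, schema)."""
    env = os.environ if env is None else env
    # Fail early with a clear message if any value is absent or empty.
    missing = [var for var in ENV_VARS.values() if not env.get(var)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return (name, *(env[var] for var in ENV_VARS.values()))


# Usage, assuming `pc` is an already-initialized Predibase client:
# connection = pc.create_connection_snowflake(
#     *snowflake_connection_args("Snowflake Connection")
# )
```

Because the helper returns the values in signature order, the tuple can be unpacked straight into the call.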
Snowflake RBAC Requirements
To connect to a Snowflake data source, the connecting user must be granted roles and privileges that allow it to view the tables that will be connected to Predibase and to start and stop compute warehouses.
We recommend creating a new user that is granted a role with, at a minimum, the MANAGE WAREHOUSES privilege (see the Snowflake docs for more detailed information).
Here is an example of how to do this within the Snowflake UI:
-- Create user
CREATE USER PREDIBASE_USER PASSWORD='a secure password goes here';
-- Create role
CREATE ROLE manage_wh_role;
GRANT MANAGE WAREHOUSES ON ACCOUNT TO ROLE manage_wh_role;
-- Grant role to user and make it the default role
GRANT ROLE manage_wh_role TO USER PREDIBASE_USER;
ALTER USER PREDIBASE_USER SET DEFAULT_ROLE = manage_wh_role;
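The statements above only cover warehouse management. For the role to also view tables, Snowflake generally requires USAGE on the database and schema plus SELECT on the tables. The sketch below generates those statements for a given role. It is an illustration under assumptions: the helper `read_grant_statements` is hypothetical, the database and schema names in the usage line are placeholders, and your account may need additional grants (for example, USAGE on the warehouse).

```python
import re

# Unquoted Snowflake identifiers: start with a letter or underscore, then
# letters, digits, underscores, or dollar signs.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def read_grant_statements(role, database, schema):
    """Build GRANT statements giving `role` read access to one schema."""
    for value in (role, database, schema):
        # Reject anything that is not a plain identifier, so the generated
        # SQL cannot contain injected text.
        if not _IDENTIFIER.match(value):
            raise ValueError(f"Not a simple unquoted identifier: {value!r}")
    qualified_schema = f"{database}.{schema}"
    return [
        f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
        f"GRANT USAGE ON SCHEMA {qualified_schema} TO ROLE {role};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {qualified_schema} TO ROLE {role};",
    ]


# Placeholder names; substitute your own database and schema.
for statement in read_grant_statements("manage_wh_role", "MY_DATABASE", "MY_SCHEMA"):
    print(statement)
```

The generated statements can be run in the same Snowflake worksheet as the example above.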
Configuration
The following values are needed to connect to Snowflake:
- Username: User to authenticate as. This is the username you use to log in to the Snowflake UI.
- Password: Password used to authenticate. This is the password you use to log in to the Snowflake UI.
- Account Identifier: Account ID, generally in the format <organization_id>-<account_name> or <organization_id>.<account_name>.
- Warehouse: The virtual warehouse being used to perform operations within your Snowflake account.
- Database: The database containing the schemas with your tables.
- Schema: The schema containing the tables you want to connect to Predibase.
Please refer to the details below for more specific instructions on how to find these values.
Username
This is the username you use to log in to the Snowflake UI. It is not the email address associated with your Snowflake account. Check out the screenshot below for an example of where to find this.
Password
This is the password you use to log in to the Snowflake UI. Check out the screenshot below for an example of where to find this.
Account Identifier
The Account Identifier uniquely identifies your Snowflake account within your organization. It generally comes in the format <organization_id>-<account_name> or <organization_id>.<account_name>. You can find it by hovering over your account tile in the bottom-right section of the Snowflake UI and then clicking the copy button. Take a look at the screenshot below for an example.
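A quick format check can catch a mistyped Account Identifier before the connection is attempted. This is a minimal sketch that only tests for the two shapes described above. The helper `split_account_identifier` is hypothetical, and the allowed character set in the pattern is an assumption rather than Snowflake's official rule.

```python
import re

# <organization_id>, then "-" or ".", then <account_name>.
# The character classes are an assumption made for illustration.
_ACCOUNT_ID = re.compile(
    r"^(?P<org>[A-Za-z0-9_]+)[-.](?P<account>[A-Za-z0-9_-]+)$"
)


def split_account_identifier(identifier):
    """Return (organization_id, account_name) or raise ValueError."""
    match = _ACCOUNT_ID.match(identifier.strip())
    if match is None:
        raise ValueError(
            "Expected <organization_id>-<account_name> or "
            f"<organization_id>.<account_name>, got {identifier!r}"
        )
    return match.group("org"), match.group("account")


# Placeholder value; use the identifier copied from the Snowflake UI.
print(split_account_identifier("myorg-myaccount"))
```

A value that passes this check is only well-formed; it is not guaranteed to refer to an existing account.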
Warehouse
A virtual warehouse, often referred to simply as a "warehouse", is a cluster of compute resources in Snowflake. A warehouse provides the required resources, such as CPU, memory, and temporary storage, to perform a variety of operations on your Snowflake data. You can find your warehouses by clicking the Warehouses tab under the Admin menu in the left-hand navigation bar of the Snowflake UI. We have an example below for reference.
Database
All data in Snowflake is stored in databases. You can find your databases by clicking the Databases tab under the Data menu in the left-hand navigation bar of the Snowflake UI. Databases contain schemas, which contain the objects that we want to connect to Predibase, such as tables and views. We have an example below for reference.
Schema
A schema is a logical grouping of database objects, such as tables and views. We have an example below for reference.
Once you enter all your Snowflake credentials, you will be prompted to select which tables you want to connect to Predibase. Every table you select will be added as a new Predibase dataset, which you can then use to train models. You don't have to import every table now; you can always come back and add more tables later.