Privacy

VPC deployments of Predibase offer a fully separate dataplane where models are fine-tuned and deployments run.

Where is my data being stored?

Fine-tuning

Unless you explicitly upload a file to Predibase (via the dataset connection page), your data is never saved to disk on our servers. When you connect a dataset to Predibase, we store the dataset credentials in a secure HashiCorp Vault instance. These credentials are read at run time by the fine-tuning workers running in your VPC to access your datasets.

For file uploads, the file is saved to a private cloud storage bucket in the cloud account that hosts your Predibase dataplane, which you manage.

Serving

In principle, Predibase never logs the prompts sent to your LLM deployments or the responses they generate. However, you can enable prompt and response logging via a flag when creating a deployment. Some customers find this valuable for building new datasets of prompt-response pairs for further fine-tuning, but the feature is turned off by default for all LLM deployments.
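As a rough sketch, opting in through the Python SDK might look like the following; the `prompt_response_logging` flag name is hypothetical (the docs do not name the flag), and the other config fields are illustrative:

```python
# A minimal sketch of opting in to prompt/response logging when creating a
# deployment. The flag name `prompt_response_logging` is hypothetical; check
# the SDK reference for the exact parameter name in your version.
from predibase import Predibase, DeploymentConfig

pb = Predibase(api_token="<YOUR_API_TOKEN>")

pb.deployments.create(
    name="my-llm",
    config=DeploymentConfig(
        base_model="mistral-7b",       # illustrative base model
        prompt_response_logging=True,  # hypothetical flag; logging is off by default
    ),
)
```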

How does my data make it to the LLM deployment running in my VPC?

There are several ways to prompt a model in Predibase. The first is through our UI, which sends a request to the public serving.app.predibase.com endpoint. The request is then sent through our SaaS controlplane and our internal network before it reaches your dataplane, and ultimately your LLM deployment.

The second way is through our SDK, or by manually sending requests to serving.app.predibase.com. Again, this will go through our SaaS controlplane and internal network.
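For example, a minimal SDK sketch (the deployment name is a placeholder):

```python
# A minimal sketch of prompting a deployment through the Python SDK. Both
# this call and a manual request to serving.app.predibase.com traverse the
# SaaS controlplane before reaching the deployment in your dataplane VPC.
from predibase import Predibase

pb = Predibase(api_token="<YOUR_API_TOKEN>")

client = pb.deployments.client("my-llm")  # "my-llm" is a placeholder name
response = client.generate("What is a VPC?", max_new_tokens=128)
print(response.generated_text)
```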

The third way is by using Direct Ingress, a Predibase feature that uses services such as AWS PrivateLink to establish a direct connection between your application's VPC and the Predibase dataplane VPC where your deployment runs.

To enable direct ingress in your VPC, please reach out to us.

Once direct ingress is enabled in your VPC, you can create deployments with the flag direct_ingress=True. With this set, you can prompt your LLM using a VPC endpoint instead of serving.app.predibase.com, and the request routes directly from your VPC into the Predibase dataplane VPC over the cloud provider's network. Once in your dataplane VPC, the request is sent directly to your LLM deployment without any hops through our shared SaaS controlplane or any external network.
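As a sketch, creating and prompting a direct-ingress deployment might look like the following. Only direct_ingress=True comes from the description above; the endpoint hostname, request path, and payload shape are placeholders you would replace with the values provided when direct ingress is provisioned:

```python
# A minimal sketch, assuming direct ingress has already been enabled for
# your VPC by the Predibase team.
import requests
from predibase import Predibase, DeploymentConfig

pb = Predibase(api_token="<YOUR_API_TOKEN>")

pb.deployments.create(
    name="my-private-llm",
    config=DeploymentConfig(
        base_model="mistral-7b",  # illustrative base model
        direct_ingress=True,      # serve traffic via the VPC endpoint
    ),
)

# Prompt via the private VPC endpoint instead of serving.app.predibase.com.
# The hostname and path below are placeholders; traffic stays on the cloud
# provider's network end to end.
resp = requests.post(
    "https://<your-vpc-endpoint-dns>/generate",
    headers={"Authorization": "Bearer <YOUR_API_TOKEN>"},
    json={"inputs": "What is PrivateLink?", "parameters": {"max_new_tokens": 64}},
    timeout=30,
)
print(resp.json())
```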

Below is an example network diagram of direct ingress on AWS.