Local File
Local files can be uploaded to Predibase and stored in the internal artifact store. Each file is limited to 500MB. Local file upload is primarily intended for experimentation and for learning Predibase. For production usage, we recommend connecting a data source managed outside of Predibase, such as Amazon S3.
Configuration
The following values are needed to upload a local file:
- File Name: path to the file on your local machine
- Name: dataset name to assign to this file in Predibase (must be unique among uploaded files)
The name you provide can be changed after the file has been uploaded, but the file contents cannot be edited. To change the contents, delete the file and upload it again. When you execute PQL queries against a file, the name plays the same role as a table name in a database.
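Because file contents cannot be edited after upload, it can help to check a file locally before uploading it. The following is a minimal sketch of such a check, not part of any Predibase API: the helper name `check_local_file` is hypothetical, and reading the 500MB limit as 500 MiB (500 * 1024 * 1024 bytes) is an assumption.

```python
from pathlib import Path

# Assumption: the 500MB per-file limit is interpreted as 500 MiB.
MAX_UPLOAD_BYTES = 500 * 1024 * 1024


def check_local_file(path, max_bytes=MAX_UPLOAD_BYTES):
    """Return the file size in bytes, or raise if the file looks unsuitable for upload.

    Hypothetical helper for illustration; it only inspects the local file system.
    """
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"No such file: {p}")
    size = p.stat().st_size
    if size > max_bytes:
        raise ValueError(
            f"{p.name} is {size} bytes, which exceeds the {max_bytes}-byte limit"
        )
    return size
```

Calling `check_local_file("train.csv")` before an upload surfaces a missing or oversized file early, rather than after a failed or partial upload.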
File Formats
Predibase supports all file formats supported by Dask, including:
- CSV
- Parquet
- HDF5
- ORC
- JSON
- Fixed-Width Text File
We also support the following formats, which are loaded as in-memory datasets using Pandas:
- MS Excel
- OpenDocument
- Feather
- Msgpack
- Stata
- SAS
- SPSS
- Python Pickle
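The two lists above can be summarized as a lookup from file extension to format group. This is only an illustrative sketch: the extension mappings and the function name `classify_format` are assumptions, not something Predibase documents, and fixed-width text files are omitted because they have no conventional extension.

```python
from pathlib import Path

# Assumed extension mappings for the Dask-backed formats listed above.
DASK_FORMATS = {
    ".csv": "CSV",
    ".parquet": "Parquet",
    ".h5": "HDF5",
    ".hdf5": "HDF5",
    ".orc": "ORC",
    ".json": "JSON",
}

# Assumed extension mappings for the in-memory (Pandas) formats listed above.
PANDAS_FORMATS = {
    ".xls": "MS Excel",
    ".xlsx": "MS Excel",
    ".ods": "OpenDocument",
    ".feather": "Feather",
    ".msgpack": "Msgpack",
    ".dta": "Stata",
    ".sas7bdat": "SAS",
    ".sav": "SPSS",
    ".pkl": "Python Pickle",
    ".pickle": "Python Pickle",
}


def classify_format(filename):
    """Return (group, format_name) for a file name, or None if unrecognized."""
    ext = Path(filename).suffix.lower()  # case-insensitive match on the last suffix
    if ext in DASK_FORMATS:
        return ("dask", DASK_FORMATS[ext])
    if ext in PANDAS_FORMATS:
        return ("pandas", PANDAS_FORMATS[ext])
    return None
```

A file in the Pandas group is loaded entirely in memory, so very large datasets are usually better stored in one of the Dask-backed formats.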