Use S3 storage for synthetic data

To use an AWS S3 bucket as a data source or destination for your synthetic data, you need to create an AWS S3 connector.

If you want to store the generated synthetic data in a separate S3 bucket, create a second S3 connector with the destination access type that points to that bucket.

Prerequisites

To create an AWS S3 connector, you need the AWS S3 connection details.

- Use long-term credentials that include an access key and a secret key.
- To use AWS S3 paths containing partitioned Parquet datasets, your AWS credentials must have the `s3:ListBucket` permission.
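
To illustrate the permission requirement, the following is a minimal sketch of an IAM policy document that grants `s3:ListBucket` on a single bucket. The helper name `build_read_policy` and the bucket name are hypothetical, and the added `s3:GetObject` statement is an assumption about what reading the files typically requires, not a documented MOSTLY AI requirement.

```python
import json


def build_read_policy(bucket: str) -> dict:
    """Return an IAM policy document that allows listing and reading one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # s3:ListBucket is a bucket-level action, so the resource
                # is the bucket ARN itself (no trailing /*).
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Assumption: reading the files also needs s3:GetObject,
                # an object-level action scoped to the keys in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "my-synthetic-data" is a placeholder bucket name.
    print(json.dumps(build_read_policy("my-synthetic-data"), indent=2))
```

The printed JSON can be attached to the IAM user that owns the access key. Adjust the actions to match your organization's access policy.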

Steps

From the Connectors page, select S3 Storage under the Connect your data header.
On the New connector modal, configure the connector.

| Field | Description |
| --- | --- |
| Name | Enter a name that you can distinguish from other connectors. |
| Access type | Select whether you want to use the connector as a source or destination. |
| Access key | The AWS access key. |
| Secret key | The AWS secret key. |
| Endpoint URL (optional) | The endpoint URL of your S3-compatible storage service. If you use Amazon S3, you can leave this field empty. If you use a different S3-compatible storage service, enter the endpoint URL of the service. For example: `https://play.min.io:9000`. |
| CA certificate (optional) | To use an encrypted connection, select Use SSL and upload your certificate. |

Click Save to save your new AWS storage connector.
- MOSTLY AI tests the connection. If you see an error, check the connection details, update them, and click Save again.
- You can click Save anyway to save the connector and disregard any errors.
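
If the connection test fails, it can help to check the same details outside the platform. The sketch below is an independent local check, not the test that MOSTLY AI runs. It assumes the third-party `boto3` package is installed, and the function names `build_client_kwargs` and `can_list_bucket` are hypothetical.

```python
from typing import Optional


def build_client_kwargs(
    access_key: str,
    secret_key: str,
    endpoint_url: Optional[str] = None,
    ca_bundle: Optional[str] = None,
) -> dict:
    """Map the connector fields onto keyword arguments for boto3.client("s3")."""
    kwargs = {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }
    if endpoint_url:
        # Only needed for S3-compatible services; leave unset for Amazon S3.
        kwargs["endpoint_url"] = endpoint_url
    if ca_bundle:
        # Path to a CA certificate file used to verify the TLS connection.
        kwargs["verify"] = ca_bundle
    return kwargs


def can_list_bucket(bucket: str, **client_kwargs) -> bool:
    """Return True if the credentials can list at least one key in the bucket."""
    # boto3 is imported lazily so the helper above works without it installed.
    import boto3
    from botocore.exceptions import BotoCoreError, ClientError

    client = boto3.client("s3", **client_kwargs)
    try:
        # Listing objects requires the s3:ListBucket permission.
        client.list_objects_v2(Bucket=bucket, MaxKeys=1)
    except (BotoCoreError, ClientError):
        return False
    return True
```

For example, `can_list_bucket("my-bucket", **build_client_kwargs(access_key, secret_key))` returns `False` when the credentials are rejected or the bucket cannot be reached, which narrows down whether the problem lies in the connection details themselves.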

What’s next

Depending on whether you created a source or a destination connector, you can use the connector as:

- Data source for a new generator
- Data destination for a new synthetic dataset