Use S3 storage for synthetic data
To use an AWS S3 bucket as a data source or destination for your synthetic data, you need to create an AWS S3 connector.
If you want to store the generated synthetic data in a separate S3 bucket, you need to create a second destination S3 connector that points to that bucket.
Prerequisites
To create an AWS S3 connector, you need the AWS S3 connection details.
- Use long-term credentials that include an access key and a secret key.
- To use AWS S3 paths containing partitioned Parquet datasets, your AWS credentials must have the s3:ListBucket permission.
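As an illustration of the permission requirement above, the following is a minimal sketch that builds an IAM policy document granting s3:ListBucket on a bucket. The helper name build_s3_policy, the bucket name my-example-bucket, and the pairing of s3:GetObject with source connectors and s3:PutObject with destination connectors are assumptions made for this example; only s3:ListBucket is named in the prerequisites, so confirm the full set of permissions your setup needs.

```python
import json


def build_s3_policy(bucket: str, access_type: str = "source") -> dict:
    """Build a minimal IAM policy document for a hypothetical connector bucket.

    Only s3:ListBucket comes from the prerequisites. The object-level
    action is an assumption for illustration.
    """
    if access_type not in ("source", "destination"):
        raise ValueError("access_type must be 'source' or 'destination'")

    # Assumption: a source connector reads objects, a destination writes them.
    object_action = "s3:GetObject" if access_type == "source" else "s3:PutObject"

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # s3:ListBucket is a bucket-level action, so the resource
                # is the bucket ARN itself (no trailing /*).
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Object-level actions apply to keys inside the bucket.
                "Effect": "Allow",
                "Action": [object_action],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "my-example-bucket" is a placeholder bucket name.
    print(json.dumps(build_s3_policy("my-example-bucket"), indent=2))
```

The printed JSON can be attached to the IAM user whose access key and secret key you enter in the connector.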
Steps
- From the Connectors page, select S3 Storage under the Connect your data header.
- On the New connector modal, configure the connector.
Field | Description |
---|---|
Name | Enter a name that you can distinguish from other connectors. |
Access type | Select whether you want to use the connector as a source or destination. |
Access key | The AWS access key. |
Secret key | The AWS secret key. |
Endpoint URL (optional) | The endpoint URL of your S3-compatible storage service. If you use Amazon S3, you can leave this field empty. If you use a different S3-compatible storage service, enter the endpoint URL of the service. For example: https://play.min.io:9000 . |
CA certificate (optional) | To use an encrypted connection, select Use SSL and upload your certificate. |
- Click Save to save your new AWS storage connector.
- MOSTLY AI tests the connection. If you see an error, check the connection details, update them, and click Save again.
- You can click Save anyway to save the connector and disregard any errors.
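If the connection test reports an error, it can help to check the same connection details outside the platform. The sketch below assumes the third-party boto3 package is installed and maps the connector fields (access key, secret key, optional endpoint URL, optional CA certificate) onto boto3 client arguments. The helper names build_client_kwargs and check_connection, the bucket name, and the credential placeholders are hypothetical. This is not how MOSTLY AI tests the connection; it is only an independent sanity check.

```python
from typing import Optional


def build_client_kwargs(
    access_key: str,
    secret_key: str,
    endpoint_url: Optional[str] = None,
    ca_bundle: Optional[str] = None,
) -> dict:
    """Map the connector form fields onto boto3 client keyword arguments."""
    kwargs = {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }
    if endpoint_url:
        # Only needed for S3-compatible services other than Amazon S3.
        kwargs["endpoint_url"] = endpoint_url
    if ca_bundle:
        # Path to a CA certificate file used to verify the TLS connection.
        kwargs["verify"] = ca_bundle
    return kwargs


def check_connection(bucket: str, **client_kwargs) -> bool:
    """Try to list at most one object; this call requires s3:ListBucket."""
    # Imported here so the helper above works without boto3 installed.
    import boto3
    from botocore.exceptions import BotoCoreError, ClientError

    client = boto3.client("s3", **client_kwargs)
    try:
        client.list_objects_v2(Bucket=bucket, MaxKeys=1)
    except (BotoCoreError, ClientError) as exc:
        print(f"Connection check failed: {exc}")
        return False
    return True


if __name__ == "__main__":
    # Placeholder values; replace them with your own connection details.
    kwargs = build_client_kwargs("YOUR_ACCESS_KEY", "YOUR_SECRET_KEY")
    print(check_connection("my-example-bucket", **kwargs))
```

A failed check usually points to one of the details entered in the connector form: the keys, the endpoint URL, the certificate, or a missing s3:ListBucket permission on the bucket.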
What’s next
Depending on whether you created a source or a destination connector, you can use the connector as:
- Data source for a new generator
- Data destination for a new synthetic dataset