To write GZIP-compressed Parquet files with AWS DMS, perform the following steps:
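Set the Parquet format and GZIP compression on the Amazon S3 target endpoint. In AWS DMS these are S3 endpoint settings (DataFormat and CompressionType), not task settings under TargetMetadata. Save the settings to a local file, for example s3-settings.json (a placeholder name); the "..." stands for the remaining endpoint settings, such as ServiceAccessRoleArn:
{ "BucketName": "bucket-name", "BucketFolder": "folder-name", "DataFormat": "parquet", "CompressionType": "gzip", ... }
Create the target endpoint from that file (endpoint-name is a placeholder). For an existing endpoint, pass the same --s3-settings argument to aws dms modify-endpoint instead:
aws dms create-endpoint --endpoint-identifier endpoint-name --endpoint-type target --engine-name s3 --s3-settings file://s3-settings.json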
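A minimal Python sketch of the endpoint settings, assuming the target is an Amazon S3 endpoint, where the DataFormat and CompressionType settings select Parquet output and GZIP compression. The helper name build_s3_settings and the role ARN placeholder are hypothetical; only the setting names come from the DMS S3Settings structure.

```python
import json

# Values accepted by the DMS S3Settings structure for these two settings.
VALID_DATA_FORMATS = {"csv", "parquet"}
VALID_COMPRESSION_TYPES = {"none", "gzip"}


def build_s3_settings(bucket_name, bucket_folder, role_arn,
                      data_format="parquet", compression_type="gzip"):
    """Return an S3Settings dict for a DMS S3 target endpoint (hypothetical helper)."""
    if data_format not in VALID_DATA_FORMATS:
        raise ValueError(f"unsupported DataFormat: {data_format!r}")
    if compression_type not in VALID_COMPRESSION_TYPES:
        raise ValueError(f"unsupported CompressionType: {compression_type!r}")
    return {
        "ServiceAccessRoleArn": role_arn,  # placeholder; supply a real IAM role ARN
        "BucketName": bucket_name,
        "BucketFolder": bucket_folder,
        "DataFormat": data_format,
        "CompressionType": compression_type,
    }


if __name__ == "__main__":
    settings = build_s3_settings("bucket-name", "folder-name", "service-access-role-arn")
    # Print the JSON document that can be saved to a file and passed to the CLI.
    print(json.dumps(settings, indent=2))
```

The printed JSON can be saved and passed to aws dms create-endpoint through --s3-settings file://..., or the dict can be supplied as the S3Settings argument of the boto3 DMS client's create_endpoint call.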
The following sample AWS CloudFormation resource defines the replication task. ReplicationTaskSettings and TableMappings each take a JSON document serialized as a string, so the placeholders below stand for those strings. The compression settings themselves belong to the endpoint referenced by TargetEndpointArn:
{
  "Type": "AWS::DMS::ReplicationTask",
  "Properties": {
    "MigrationType": "full-load",
    "ReplicationTaskSettings": "replication-task-settings",
    "SourceEndpointArn": "source-endpoint-arn",
    "TargetEndpointArn": "target-endpoint-arn",
    "ReplicationInstanceArn": "replication-instance-arn",
    "TableMappings": "table-mappings",
    "Tags": [{"Key": "Name", "Value": "replication-task-name"}]
  }
}
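As a sketch of how such a resource could be generated, the Python below builds an AWS::DMS::ReplicationTask definition and serializes TableMappings and ReplicationTaskSettings to JSON strings, which is the form CloudFormation expects for those two properties. The function name build_replication_task and the sample mapping rule are illustrative assumptions, not part of the original answer.

```python
import json


def build_replication_task(source_arn, target_arn, instance_arn,
                           table_mappings, task_settings=None,
                           migration_type="full-load",
                           name="replication-task-name"):
    """Return an AWS::DMS::ReplicationTask resource as a dict (hypothetical helper)."""
    properties = {
        "MigrationType": migration_type,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # TableMappings must be a JSON string, not a nested object.
        "TableMappings": json.dumps(table_mappings),
        "Tags": [{"Key": "Name", "Value": name}],
    }
    if task_settings is not None:
        # ReplicationTaskSettings is likewise passed as a JSON string.
        properties["ReplicationTaskSettings"] = json.dumps(task_settings)
    return {"Type": "AWS::DMS::ReplicationTask", "Properties": properties}


if __name__ == "__main__":
    # Illustrative selection rule that includes every table in every schema.
    mappings = {"rules": [{
        "rule-type": "selection", "rule-id": "1", "rule-name": "1",
        "object-locator": {"schema-name": "%", "table-name": "%"},
        "rule-action": "include",
    }]}
    resource = build_replication_task(
        "source-endpoint-arn", "target-endpoint-arn",
        "replication-instance-arn", mappings)
    print(json.dumps(resource, indent=2))
```

Generating the strings with json.dumps avoids hand-escaping quotes inside the template, which is a common source of template validation errors.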