AWS Glue: Reading and Writing Parquet Files (More Than 50,000 Files)
Founder
2024-09-25 14:31:57
  1. Import the required Python libraries:
import boto3
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql.functions import *
from awsglue.context import GlueContext
from pyspark.sql import SparkSession
from awsglue.dynamicframe import DynamicFrame
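  1. Configure the AWS Glue job parameters:
from awsglue.job import Job  # Job tracks the run and its bookmark state

args = getResolvedOptions(sys.argv, ['JOB_NAME'])
glueContext = GlueContext(SparkContext.getOrCreate())
# Initialize the job with its name and resolved arguments
job = Job(glueContext)
job.init(args['JOB_NAME'], args)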
  1. Define the source and destination paths:
source_path = "s3://source-bucket/path/to/parquet/files/"
destination_path = "s3://destination-bucket/path/to/parquet/files/"
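  1. Read the data:
# Read every Parquet file under source_path, including subfolders
in_dyf = glueContext.create_dynamic_frame.from_options(
    connection_type="s3", format="parquet",
    connection_options={"paths": [source_path], "recurse": True},
    transformation_ctx="my_source")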
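  1. Build the transformation:
applymapping1 = ApplyMapping.apply(frame=in_dyf, mappings=[
    # Each tuple: (source column, source type, destination column, destination type).
    # ApplyMapping keeps only the columns listed here.
    ("field_name_in_source", "string", "field_name_in_destination", "string"),
    ("field_name_in_source2", "string", "field_name_in_destination2", "string"),
    # Placeholder entry so the column cast in the next step survives the mapping
    ("numeric_column_name", "string", "numeric_column_name", "string")
], transformation_ctx="applymapping1")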
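Because the mapping list is plain Python data, it can be checked before the job touches any files. The sketch below is an assumption layered on the original: `check_mappings` is a hypothetical helper that verifies each entry has the four elements ApplyMapping expects and that no destination column is listed twice.

```python
def check_mappings(mappings):
    """Return the destination column names, or raise if a mapping is malformed."""
    targets = []
    for m in mappings:
        if len(m) != 4:
            raise ValueError(f"expected 4 elements, got {len(m)}: {m!r}")
        source, source_type, target, target_type = m
        if target in targets:
            raise ValueError(f"duplicate destination column: {target}")
        targets.append(target)
    return targets


mappings = [
    ("field_name_in_source", "string", "field_name_in_destination", "string"),
    ("field_name_in_source2", "string", "field_name_in_destination2", "string"),
]
output_columns = check_mappings(mappings)
```

The returned list is also a quick way to see which columns later steps can rely on.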
  1. Convert data types:
# numeric_column_name must be among the columns kept by the mapping above
cast3 = applymapping1.toDF().withColumn(
    "numeric_column_name", col("numeric_column_name").cast("decimal(10,2)"))
  1. Write the data:
out_dyf = DynamicFrame.fromDF(cast3, glueContext, "out_dyf")
glueContext.write_dynamic_frame.from_options(frame=out_dyf, connection_type="s3",
                                             connection_options={"path": destination_path},
                                             format="parquet",
                                             transformation_ctx="out")
job.commit()  # mark the run as finished so job bookmark state is saved
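When the input consists of more than 50,000 small files, writing it back unchanged can produce a similarly large number of small output files. One common approach, not part of the original job, is to repartition the DataFrame before writing. The sketch below only computes a partition count: `target_partitions` is a hypothetical helper, and the 128 MiB target file size and the 50 GiB input size are assumptions to be replaced with real values.

```python
import math


def target_partitions(total_bytes, target_file_bytes=128 * 1024 * 1024):
    """Estimate how many output files give roughly target_file_bytes each."""
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("sizes must be non-negative and the target positive")
    return max(1, math.ceil(total_bytes / target_file_bytes))


# Example: about 50 GiB of input data (assumed figure)
num_partitions = target_partitions(50 * 1024 ** 3)

# In the job, the count would be applied before converting back, e.g.:
# cast3 = cast3.repartition(num_partitions)
```

Note that `repartition` triggers a full shuffle, so the estimate trades extra work during the job for fewer, larger files at the destination.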
