ApacheBeam管道Java:记录未按顺序写入目标文件。
创始人
2024-09-05 11:30:06
0

问题源于具有并行化和异步处理功能的Apache Beam框架可能无法保证数据流在目标文件中以正确的顺序写入。为了解决这个问题,可以使用有序写入器(OrderedWriter)来确保记录按正确的顺序写入目标文件。以下是一个Java代码示例,演示如何使用有序写入器来解决此问题:

PCollection records = ...; // input PCollection
PCollectionView filenamesView = ...; // PCollectionView of the filenames to write to
final TupleTag doneTag = new TupleTag<>();
//Create a new PCollection by assigning a unique, increasing ID to each element
PCollection> keyedRecords =
        records.apply("AddUniqueIds", WithKeys.of((Void) null)).setCoder(KvCoder.of(VarIntCoder.of(), MyRecordCoder.of()));
//Group all elements assigned with the same key (null), and sort all records by id.
PCollection sortedRecords =
        PCollectionList.of(keyedRecords.apply(GroupByKey.create()))
                .apply(ParDo.of(new DoFn>, MyRecord>() {
                    @ProcessElement
                    public void processElement(ProcessContext context) {
                        List sorted = new ArrayList<>();
                        for (MyRecord r : context.element().getValue()) {
                            sorted.add(r);
                        }
                        Collections.sort(sorted, new Comparator() {
                            @Override
                            public int compare(MyRecord o1, MyRecord o2) {
                                // Assumes that MyRecord has a method that returns its id as an int.
                                return Long.compare(o1.getId(), o2.getId());
                            }
                        });
                        for (MyRecord r : sorted) {
                            context.output(r);
                        }
                    }
                })).setCoder(MyRecordCoder.of());
// Create a new representation of the input PCollection where each element is a tuple containing
// the filename and the record it belongs to.
// For example, if the input file was records 0, 1, 2, 3, 4, 5, 6, the output would be:
// ("file0", record 0), ("file1", record 1), ("file2", record 2), ("file3", record 3), ...
final PCollection> keyedOutput = sortedRecords
        .apply("AssignFilename", ParDo.of(new DoFn>() {
            @ProcessElement
            public void processElement(ProcessContext context) {
                List filenames = context.sideInput

相关内容

热门资讯

值得注意的是!werplan免... 值得注意的是!werplan免费挂下载(透视)好像是真的辅助安装(有挂细节)-哔哩哔哩该软件可以轻松...
今日!wepoker模拟器哪个... 今日!wepoker模拟器哪个好用(透视)总是有辅助平台(有挂总结)-哔哩哔哩1、用户打开应用后不用...
透视脚本!pokemmo辅助官... 透视脚本!pokemmo辅助官网(透视)好像是真的辅助插件(真的有挂)-哔哩哔哩1、进入游戏-大厅左...
据监测!wepoker手机助手... 据监测!wepoker手机助手(透视)原来是真的辅助平台(有挂规律)-哔哩哔哩一、wepoker手机...
透视脚本!wpk可以作必弊吗(... 透视脚本!wpk可以作必弊吗(透视)一直真的有辅助神器(有挂秘笈)-哔哩哔哩透视脚本!wpk可以作必...
针对!如何判断wpk辅助软件的... 针对!如何判断wpk辅助软件的真假(透视)确实存在有辅助辅助器(有挂教程)-哔哩哔哩1、不需要AI权...
透视讲解!newpoker脚本... 透视讲解!newpoker脚本(透视)一贯是真的辅助平台(确实有挂)-哔哩哔哩1、进入到newpok...
今日!werplan怎么作必弊... 今日!werplan怎么作必弊(透视)确实存在有辅助神器(有挂规律)-哔哩哔哩进入游戏-大厅左侧-新...
这一现象值得深思!wepoke... 这一现象值得深思!wepoker辅助器安装包定制(透视)总是真的有辅助安装(有挂神器)-哔哩哔哩1、...
更值得关注的是!wepoker... 更值得关注的是!wepoker私人局辅助器怎么用(透视)本来有辅助软件(有人有挂)-哔哩哔哩1、we...