Is it possible to use setSchemaUpdateOptions(ALLOW_FIELD_ADDITION) on the BigQuery load config in Dataflow with the built-in BigQueryIO.Write?
I would like to use the experimental option that allows me to update a BigQuery schema when performing a load job. I'm using Dataflow and the b...
How do I create user-defined counters in Dataflow?
How can I create my own counters in my DoFns? In my DoFn I'd like to increment a counter every time a condition is met when processing a record. I'd like this count...
Running a Google Dataflow job at startup
Our Google Cloud Dataflow pipeline program calls a library that dynamically links to *.so files, so to run it I need to set the Linux environment variable LD_LIBRARY_PATH. There...
Google Cloud Dataflow output to Cassandra
What is the best way to write Google Cloud Dataflow output to Cassandra? I can't seem to find many people doing it. After searching...
How to read from PubSub in DataFlow in batch mode
In SDK 1.9.1, the Pubsub source had the methods PubsubIO.Read.maxReadTime and PubsubIO.Read.maxNumRecords available. Those methods made it possible to create a bounded collection f...
Unable to write to BigQuery - Permission denied: Apache Beam Python - Google Dataflow
I have been using the Apache Beam Python SDK with the Google Cloud Dataflow service for quite some time now. I was setting Dataflow up fo...
java.lang.NoClassDefFoundError: org/apache/beam/sdk/runners/PipelineRunner after migrating to Dataflow 2.x
Getting a runtime error: "java.lang.NoClassDefFoundError: org/apache/beam/sdk/runners/PipelineRunner" even though I have the below in my pom.xml...
How do I write to BigQuery with a schema computed during execution of the same Dataflow pipeline?
My scenario is a variation on the one discussed here: How do I write to BigQuery using a schema computed during Dataflow execution? ...
How do I use BigQuery Standard SQL in Dataflow?
I would like to run a simple query using BigQuery Standard SQL within Dataflow, but I can't find where to enable this option. How can I do that? ...
How do I use Dataflow in an existing Maven project?
What dependencies and other modifications do I need to make to my pom file so that I can start using the Dataflow SDK with my existing project? ...
Infinite recursion when outputting a nested TableRow in Google Cloud Dataflow
I'm trying to pass a TableRow I've generated between stages of my pipeline, and I get the following error: Exception in thread "main" co...
Google Cloud DataFlow PubSubIO does not read from a full topic
I'm trying to run a pipeline in Google Cloud DataFlow, in "Streaming" mode. The pipeline should read from a PubSub topic; however, it doesn't actually read from the topic until I del...
Google DataFlow Python pipeline write fails
I'm running a simple DataFlow pipeline w/ the Python SDK for counting keywords. The job runs fine for pre-processing the input data, but it fails for grouping/output steps with th...
apache_beam.transforms.util.Reshuffle() is not available in GCP Dataflow
I have upgraded to the latest apache_beam[gcp] package via pip install --upgrade apache_beam[gcp]. However, I noticed that Reshuffle() does not appear in the [gcp] distributi...
How do I submit a Dataflow job from a jar?
For reproducibility I want to be able to build jars containing Dataflow jobs and then run them with different parameters (e.g. promote them through different accounts). This wil...
Unable to create a generic date conversion class in DataFlow Apache Beam
I am trying to create a generic class for date conversion in DataFlow with the code below: class DateConversion { private static final StringBuf...
Using the Beam SDK in Cloud Dataflow
We are currently using Google's Cloud Dataflow SDK (1.6.0) to run Dataflow jobs in GCP; however, we are considering moving to the Apache Beam SDK (0.1.0). We will still be running o...
Service account credentials for Dataflow pipeline options
Upgrading from Dataflow 1.9 to Beam 0.4.0. The methods on GcpOptions to set the service account name (setServiceAccountName) and key file (setServiceAccountKeyFile) are no longe...
Simple question about streaming data directly into Cloud SQL using Google DataFlow
So I am working on a little project that sets up a streaming pipeline using Google Dataflow and Apache Beam. I went through some tutorials and was able to get a pipeline up and running...
Is it possible to process Google Analytics data with Google Dataflow?
I would like to use Google Dataflow to process Google Analytics data from many websites and store the results in a Google SQL. ...