Problem querying S3 via ODBC / JDBC with Apache Drill

Time: 2023-02-11 23:07:23

I'm using Apache Drill (v1.10.0) in embedded mode on Windows to connect to S3, but I can't query successfully unless I use the Drill Explorer client.

The ODBC connection works (connection string below):

CastAnyToVarchar=true;
Catalog=s3citibike;
Schema=default;
HandshakeTimeout=5;
QueryTimeout=180;
TimestampTZDisplayTimezone=local;
NumberOfPrefetchBuffers=5;
StringColumnLength=1024;
ConvertToCast=false

If I use Drill Explorer (connected directly to the Drillbit), I can see the files in s3citibike.default and view the data (see attached image). For some reason, though, I cannot see my files when using ODBC with another client such as Excel.

I can query using sqlline; for example, the query below returns the dataset successfully:

SELECT * FROM `s3citibike`.`default`.`./201307-citibike-tripdata.csv` LIMIT 100;

I'm guessing I'm just not specifying the folder path correctly. I've been looking around for a while and tried Catalog = DRILL with schema = s3citibike.default, to no avail.

I'd try the drill-jdbc-all-1.10.0.jar JDBC driver for my client, but I understand it doesn't work with embedded mode on Windows.

One of my sales guys just asked whether I could get this working for a customer meeting in a couple of hours, where an inability to query S3 via Apache Drill ODBC or JDBC is a dealbreaker.

Can anyone see where I'm going wrong?

Thanks and regards, Jack

1 solution

#1


I got some feedback from the Apache Drill user group:

"With tools like Excel you will either have to figure out how to enter custom SQL, or if you want the data to be more visible to these tools you will have to create Drill Views and then reference these views from the tool via ODBC/JDBC. Properly define the column name and data types in the Views to make it easier for the end user/tool to process the data (this way you push the work to Drill)."

I created a view in a .tmp schema that references the schema containing my CSV files. I was then able to see and query this view successfully from my client.
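
For anyone following along, the view approach might look roughly like the sketch below. This is not the exact statement I ran. It assumes the `s3citibike` storage plugin has a writable workspace named `tmp`, that the CSV is read without header extraction (so Drill exposes each row as the `columns` array), and that the first three fields are trip duration, start time and stop time. The view name and column aliases are hypothetical. The casts stay as VARCHAR so that a header row, if present, does not break the query.

```sql
-- Sketch only: workspace `tmp`, the view name, and the column positions
-- and aliases are assumptions, not taken from the original setup.
CREATE OR REPLACE VIEW `s3citibike`.`tmp`.`citibike_trips_201307` AS
SELECT
  CAST(columns[0] AS VARCHAR(32)) AS trip_duration,  -- assumed field 0
  CAST(columns[1] AS VARCHAR(32)) AS start_time,     -- assumed field 1
  CAST(columns[2] AS VARCHAR(32)) AS stop_time       -- assumed field 2
FROM `s3citibike`.`default`.`./201307-citibike-tripdata.csv`;

-- Check that the view is listed in Drill's metadata, which is what
-- ODBC/JDBC tools browse when they enumerate schemas and tables.
SELECT TABLE_SCHEMA, TABLE_NAME
FROM INFORMATION_SCHEMA.`VIEWS`
WHERE TABLE_SCHEMA = 's3citibike.tmp';

-- Query the view the same way the client tool would.
SELECT * FROM `s3citibike`.`tmp`.`citibike_trips_201307` LIMIT 100;
```

Once the columns are confirmed, the VARCHAR casts can be tightened to proper types (for example INTEGER or TIMESTAMP), provided the header row is skipped or filtered out first.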
