Retrieving a CSV file from a web page

Time: 2022-10-23 15:38:23

I would like to save a CSV file from a web page. However, the link on the page does not lead directly to the file; instead it calls some JavaScript, which in turn opens the file. In other words, there is no explicit URL for the file I want to download, or at least I don't know what it should be. I found a way to download the file by launching Internet Explorer, going to the web page, clicking the link, and then saving the file through the dialog box. This is pretty ugly, and I am wondering if there is a more elegant (and faster) way to retrieve the file without using Internet Explorer (e.g. by using the urllib.urlretrieve method). The JavaScript is of the following form (see the comment; the site does not let me publish the source code...):


"CSV"

Any ideas?

Sasha

2 answers

#1


You can look at what the javascript function is doing, and it should tell you exactly where it's downloading from.
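As a rough illustration of that idea, here is a minimal Python 3 sketch. It assumes the link's JavaScript lives in an `href="javascript:..."` or `onclick` attribute and that the target path appears there as a plain quoted string. The markup, the `openFile` function name, and the example.com address are all hypothetical, since the real page source was not posted. If the URL is built inside a separate script instead, this will not find it.

```python
import re
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin

# Matches single- or double-quoted string literals inside a JavaScript snippet.
_JS_STRING = re.compile(r"""(['"])(.*?)\1""")


class _HandlerCollector(HTMLParser):
    """Collect JavaScript found in onclick and href="javascript:..." attributes."""

    def __init__(self):
        super().__init__()
        self.snippets = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue
            if name == "onclick":
                self.snippets.append(value)
            elif name == "href" and value.lower().startswith("javascript:"):
                self.snippets.append(value[len("javascript:"):])


def candidate_urls(html, page_url):
    """Return absolute URLs for path-like string literals in the link handlers."""
    collector = _HandlerCollector()
    collector.feed(html)
    urls = []
    for snippet in collector.snippets:
        for _quote, literal in _JS_STRING.findall(snippet):
            # Heuristic (an assumption): anything with a slash or dot may be a path.
            if "/" in literal or "." in literal:
                url = urljoin(page_url, literal)
                if url not in urls:
                    urls.append(url)
    return urls


def download(url, destination):
    """Save the resource at `url` to the local file `destination`."""
    urllib.request.urlretrieve(url, destination)
```

With a candidate URL in hand, `download(url, "report.csv")` would fetch it directly, without a browser. If the server expects cookies or a POST request, a plain GET like this will not be enough.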


#2


I had exactly this sort of problem a year or two back. I ended up installing the Rhino JavaScript engine, grepping the JavaScript out of the target document, evaluating the URL within Rhino, and then fetching the result.
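The steps described here could be sketched in Python 3 roughly as follows. This is not the answerer's actual code. It assumes the URL is computed by functions defined in inline `<script>` blocks and that those scripts do not depend on browser objects such as `document` or `window`. The `engine_cmd` argument is a placeholder for however a standalone engine is launched on your machine (for Rhino, typically a `java -jar` command pointing at wherever the jar is installed), and the appended `print(...)` call assumes the engine's shell provides a global `print`, as the Rhino shell does.

```python
import subprocess
import tempfile
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect the bodies of inline <script> elements."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


def inline_scripts(html):
    """Return the non-empty inline script bodies found in `html`."""
    collector = ScriptCollector()
    collector.feed(html)
    return [body for body in collector.scripts if body.strip()]


def build_program(html, expression):
    """Join the page's inline scripts and append a line that prints `expression`."""
    return "\n".join(inline_scripts(html)) + "\nprint(" + expression + ");\n"


def evaluate(program, engine_cmd):
    """Run `program` with an external JavaScript engine and return its output.

    `engine_cmd` is a list such as ["java", "-jar", "<path to rhino jar>"];
    the exact command is an assumption and depends on the local install.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".js", delete=False) as handle:
        handle.write(program)
    result = subprocess.run(
        engine_cmd + [handle.name], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

For a page defining a hypothetical `csvUrl(id)` function, `evaluate(build_program(page, "csvUrl(7)"), engine_cmd)` would return the computed URL as text, which can then be fetched with `urllib.request.urlretrieve`.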
