C# class for parsing the WebRequestMethods.Ftp.ListDirectoryDetails FTP response

Time: 2022-05-21 05:48:41

I'm creating a service to monitor FTP locations for new updates, and I need to parse the response returned by an FtpWebRequest that uses the WebRequestMethods.Ftp.ListDirectoryDetails method. This would be fairly easy if all responses followed the same format, but different FTP server software returns different response formats.

For example one might return:

08-10-11  12:02PM       <DIR>          Version2
06-25-09  02:41PM            144700153 image34.gif
06-25-09  02:51PM            144700153 updates.txt
11-04-10  02:45PM            144700214 digger.tif

And another server might return:

d--x--x--x    2 ftp      ftp          4096 Mar 07  2002 bin
-rw-r--r--    1 ftp      ftp        659450 Jun 15 05:07 TEST.TXT
-rw-r--r--    1 ftp      ftp      101786380 Sep 08  2008 TEST03-05.TXT
drwxrwxr-x    2 ftp      ftp          4096 May 06 12:24 dropoff

Other differences have been observed as well, so there are likely to be a number of subtle variations I haven't encountered yet.

Does anyone know of a fully managed (doesn't require access to external dll on Windows) C# class that handles these situations seamlessly?

I only need to list the contents of a directory with the following details: file/directory name and last updated or created timestamp.

Thanks in advance for any suggestions, Gavin

4 solutions

#1


8  

For the first (DOS/Windows) listing this code will do:

// Requires: using System; using System.Globalization; using System.IO;
//           using System.Net; using System.Text.RegularExpressions;
FtpWebRequest request = (FtpWebRequest)WebRequest.Create("ftp://ftp.example.com/");
request.Credentials = new NetworkCredential("user", "password");
request.Method = WebRequestMethods.Ftp.ListDirectoryDetails;

// Groups: 1 = date and time, 2 = "<DIR>" or size in bytes, 3 = name
string pattern = @"^(\d+-\d+-\d+\s+\d+:\d+(?:AM|PM))\s+(<DIR>|\d+)\s+(.+)$";
Regex regex = new Regex(pattern);
IFormatProvider culture = CultureInfo.GetCultureInfo("en-us");

// Dispose the response and the reader once the listing has been read
using (WebResponse response = request.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        Match match = regex.Match(line);
        if (!match.Success)
        {
            continue; // skip lines that are not in the DOS/Windows format
        }

        DateTime modified =
            DateTime.ParseExact(
                match.Groups[1].Value, "MM-dd-yy  hh:mmtt", culture, DateTimeStyles.None);
        long size = (match.Groups[2].Value != "<DIR>") ? long.Parse(match.Groups[2].Value) : 0;
        string name = match.Groups[3].Value;

        Console.WriteLine(
            "{0,-16} size = {1,9}  modified = {2}",
            name, size, modified.ToString("yyyy-MM-dd HH:mm"));
    }
}

You will get:

Version2         size =         0  modified = 2011-08-10 12:02
image34.gif      size = 144700153  modified = 2009-06-25 14:41
updates.txt      size = 144700153  modified = 2009-06-25 14:51
digger.tif       size = 144700214  modified = 2010-11-04 14:45

For the other (*nix) listing, see my answer to Parsing FtpWebRequest ListDirectoryDetails line.
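
For reference, here is a minimal sketch of how the *nix-style lines from the question could be parsed. It is not the code from the linked answer. The names UnixListingParser and TryParseLine are hypothetical, and the sketch assumes the common ls -l layout: a 10-character permission string, English month abbreviations, and the entry name as the last field. Servers that deviate from this layout will need a different pattern.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the .NET Framework.
public static class UnixListingParser
{
    // Groups: 1 = permissions, 2 = owner, 3 = group, 4 = size,
    //         5 = month and day, 6 = year or HH:mm, 7 = name
    private static readonly Regex LineRegex = new Regex(
        @"^([\w-]{10})\s+\d+\s+(\S+)\s+(\S+)\s+(\d+)\s+(\w{3}\s+\d{1,2})\s+(\d{4}|\d{1,2}:\d{2})\s+(.+)$");

    public static bool TryParseLine(
        string line, out string name, out long size, out bool isDirectory, out DateTime modified)
    {
        name = null;
        size = 0;
        isDirectory = false;
        modified = default(DateTime);

        Match match = LineRegex.Match(line);
        if (!match.Success)
        {
            return false;
        }

        isDirectory = match.Groups[1].Value[0] == 'd';
        size = long.Parse(match.Groups[4].Value, CultureInfo.InvariantCulture);
        name = match.Groups[7].Value;

        // Collapse runs of whitespace so the text matches one of the two formats below.
        string monthDay = Regex.Replace(match.Groups[5].Value, @"\s+", " ");
        string stamp = monthDay + " " + match.Groups[6].Value;

        // When the listing shows a time instead of a year, .NET fills in the current year.
        string[] formats = { "MMM d yyyy", "MMM d H:mm" };
        return DateTime.TryParseExact(
            stamp, formats, CultureInfo.InvariantCulture, DateTimeStyles.None, out modified);
    }
}
```

Lines that do not match (for example a "total" header line) simply return false, so the caller can skip them or try another format.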

But actually, trying to parse the listing returned by ListDirectoryDetails is not the right way to go.

You want to use an FTP client that supports the modern MLSD command, which returns a directory listing in a machine-readable format specified in RFC 3659. Parsing the human-readable format returned by the ancient LIST command (used internally by FtpWebRequest for its ListDirectoryDetails method) should be a last resort, used only when talking to obsolete FTP servers that do not support the MLSD command (like the Microsoft IIS FTP server).
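
To illustrate why the MLSD format is easier to handle, here is a small sketch that parses a single MLSD entry as laid out in RFC 3659: a run of name=value facts, each terminated by a semicolon, then one space, then the file name. The names MlsdEntry and MlsdParser are hypothetical, and the sample lines in the usage note are made up rather than taken from a real server. The sketch only covers the parsing step; the lines themselves would have to come from an FTP client that can send MLSD.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical types for illustration only.
public sealed class MlsdEntry
{
    public string Name;
    public bool IsDirectory;
    public long? Size;
    public DateTime? ModifiedUtc;
}

public static class MlsdParser
{
    public static MlsdEntry ParseLine(string line)
    {
        // The facts end at the first space; everything after it is the name,
        // so names containing spaces need no special handling.
        int space = line.IndexOf(' ');
        if (space < 0)
        {
            throw new FormatException("Not an MLSD entry: " + line);
        }

        // Fact names are case-insensitive.
        var facts = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string[] pairs = line.Substring(0, space)
            .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string pair in pairs)
        {
            int eq = pair.IndexOf('=');
            if (eq > 0)
            {
                facts[pair.Substring(0, eq)] = pair.Substring(eq + 1);
            }
        }

        var entry = new MlsdEntry { Name = line.Substring(space + 1) };

        string value;
        if (facts.TryGetValue("type", out value))
        {
            // Standard types are file, dir, cdir (current dir) and pdir (parent dir).
            entry.IsDirectory = value.EndsWith("dir", StringComparison.OrdinalIgnoreCase);
        }

        long size;
        if (facts.TryGetValue("size", out value) &&
            long.TryParse(value, NumberStyles.None, CultureInfo.InvariantCulture, out size))
        {
            entry.Size = size;
        }

        // "modify" is YYYYMMDDHHMMSS in UTC, optionally followed by fractional seconds.
        DateTime modified;
        if (facts.TryGetValue("modify", out value) && value.Length >= 14 &&
            DateTime.TryParseExact(
                value.Substring(0, 14), "yyyyMMddHHmmss", CultureInfo.InvariantCulture,
                DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal, out modified))
        {
            entry.ModifiedUtc = modified;
        }

        return entry;
    }
}
```

For example, MlsdParser.ParseLine("type=file;size=144700153;modify=20090625144100; image34.gif") yields an entry named image34.gif with an unambiguous UTC timestamp, with no guessing about date order, locale, or a missing year.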

For example, with the WinSCP .NET assembly, you can use its Session.ListDirectory or Session.EnumerateRemoteFiles methods.

They internally use the MLSD command, but can fall back to the LIST command and support dozens of different human-readable listing formats.

The returned listing is presented as a collection of RemoteFileInfo instances with properties like:

  • Name
  • LastWriteTime (with correct timezone)
  • Length
  • FilePermissions (parsed into individual rights)
  • Group
  • Owner
  • IsDirectory
  • IsParentDirectory
  • IsThisDirectory

(I'm the author of WinSCP)

Most other third-party libraries will do the same. Using the FtpWebRequest class is not reliable for this purpose. Unfortunately, there's no other built-in FTP client in the .NET Framework.

#2


5  

I'm facing this same problem and have built a simple (albeit not very robust) solution using a Regex to parse out the relevant information from each line using capture groups:

public static Regex FtpListDirectoryDetailsRegex = new Regex(@".*(?<month>(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec))\s*(?<day>[0-9]*)\s*(?<yearTime>([0-9]|:)*)\s*(?<fileName>.*)", RegexOptions.Compiled | RegexOptions.IgnoreCase);

You can then extract the values out of the capture groups by:

        string ftpResponse = "-r--r--r-- 1 ftp ftp              0 Nov 19 11:08 aaa.txt";
        Match match = FtpListDirectoryDetailsRegex.Match(ftpResponse);
        string month = match.Groups["month"].Value;
        string day = match.Groups["day"].Value;
        string yearTime = match.Groups["yearTime"].Value;
        string fileName = match.Groups["fileName"].Value;

Some things to note are:

  • This will only work for directory responses in the format shown in the ftpResponse variable above. In my case I'm lucky to be accessing the same FTP server each time, so it is unlikely that the response format will change.
  • The yearTime variable can represent either the year or the time of the file's timestamp. You will need to parse this manually by looking for a colon (:) character, which indicates that the capture group contains a time rather than a year.
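
The colon check described in the second point could look roughly like the sketch below. ListingTimestamp and Resolve are hypothetical names, and the year inference is an assumption rather than something the server tells you: Unix-style listings conventionally show a time instead of a year only for recent files, so a date that would land in the future is taken to belong to the previous year. The current time is passed in as a parameter so the behavior is easy to test. A Feb 29 entry without a year is not handled specially.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for turning the month/day/yearTime captures into a DateTime.
public static class ListingTimestamp
{
    private static readonly string[] Months =
    {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };

    public static DateTime Resolve(string month, string day, string yearTime, DateTime now)
    {
        int monthNumber = Array.FindIndex(
            Months, m => m.Equals(month, StringComparison.OrdinalIgnoreCase)) + 1;
        if (monthNumber == 0)
        {
            throw new FormatException("Unknown month: " + month);
        }

        int dayNumber = int.Parse(day, CultureInfo.InvariantCulture);

        // No colon: the capture holds a year, and the listing carries no time of day.
        if (yearTime.IndexOf(':') < 0)
        {
            int year = int.Parse(yearTime, CultureInfo.InvariantCulture);
            return new DateTime(year, monthNumber, dayNumber);
        }

        // Colon present: the capture holds HH:mm and the year is implied.
        string[] parts = yearTime.Split(':');
        int hour = int.Parse(parts[0], CultureInfo.InvariantCulture);
        int minute = int.Parse(parts[1], CultureInfo.InvariantCulture);
        DateTime candidate = new DateTime(now.Year, monthNumber, dayNumber, hour, minute, 0);

        // Assumption: a timestamp more than a day in the future belongs to last year.
        return candidate > now.AddDays(1) ? candidate.AddYears(-1) : candidate;
    }
}
```

With the sample line above, calling ListingTimestamp.Resolve(month, day, yearTime, DateTime.Now) on a day in May would place "Nov 19 11:08" in the previous year, while a capture such as "2002" is returned as a plain date in that year.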

#3


4  

One solution I came across is EdtFTPnet.

EdtFTPnet seems to be quite a feature-packed solution that handles lots of different FTP options, so it is ideal.

It's the free, open-source solution that I've now employed for http://www.ftp2rss.com (a little tool I needed myself but figured might be useful to others too).

#4


0  

Take a look at the Ftp.dll FTP client.

It includes an automatic directory listing parser for most FTP servers on Windows, Unix and Netware platforms.

Please note that this is a commercial product I developed.
