Reading XML from a tar.gz file

Time: 2023-01-15 08:07:55

I am working with some data files that were exported, I guess, from a MAC system. I received a file named 20110205.tar, and when I tried to look at the content inside, all I got was raw ?BIN files. My friend helped me extract it (he used Python to do this, but I have no idea about Python), which produced a bunch of XML files whose names are in time format: "2011-03-15T23_57_59Z.xml", "2011-03-15T23_58_00Z.xml". I tried the XML package with functions such as xmlTree, xmlTreeParse and asXMLNode, and then I got completely stuck. I also tried some other packages such as epidata, but it seems many packages are no longer available. I compressed the extracted files with WinRAR and uploaded them to MediaFire: http://www.mediafire.com/?ot8vt0wdw5c3oc1

When I open one of the XML files in Notepad I see something like:

<asdiOutput xmlns="http://tfm.faa.gov/tfms/TFMS_XIS"
            xmlns:nxce="http://tfm.faa.gov/tfms/NasXCoreElements"
            xmlns:mmd="http://tfm.faa.gov/tfms/MessageMetaData"
            xmlns:nxcm="http://tfm.faa.gov/tfms/NasXCommonMessages"
            xmlns:idr="http://tfm.faa.gov/tfms/TFMS_IDRS"
            xmlns:xis="http://tfm.faa.gov/tfms/TFMS_XIS"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://tfm.faa.gov/tfms/TFMS_XIS http://localhost:58489/tfms/schema/TFMS_XIS.xsd"
            timestamp="2011-03-15T23:57:59Z">
  <asdiMessage sourceFacility="CCZM" sourceTimeStamp="2011-03-15T23:57:27Z" trigger="TZ">
    <trackInformation>
      <nxcm:aircraftId>UAL966</nxcm:aircraftId>
      <nxcm:speed>470</nxcm:speed>
      <nxcm:reportedAltitude>
        <nxce:assignedAltitude>
          <nxce:simpleAltitude>350</nxce:simpleAltitude>
        </nxce:assignedAltitude>
      </nxcm:reportedAltitude>
      <nxcm:position>
        <nxce:latitude>
          <nxce:latitudeDMS degrees="45" minutes="40" direction="NORTH"/>
        </nxce:latitude>
        <nxce:longitude>
          <nxce:longitudeDMS degrees="056" minutes="58" direction="WEST"/>
        </nxce:longitude>
      </nxcm:position>
    </trackInformation>
  </asdiMessage>
  <asdiMessage sourceFacility="CCZM" sourceTimeStamp="2011-03-15T23:57:27Z" trigger="TZ">
    <trackInformation>
      <nxcm:aircraftId>UAL936</nxcm:aircraftId>
      <nxcm:speed>470</nxcm:speed>
      <nxcm:reportedAltitude>
        <nxce:assignedAltitude>
          <nxce:simpleAltitude>350</nxce:simpleAltitude>
        </nxce:assignedAltitude>
      </nxcm:reportedAltitude>
      <nxcm:position>
        <nxce:latitude>
          <nxce:latitudeDMS degrees="44" minutes="43" direction="NORTH"/>
        </nxce:latitude>
        <nxce:longitude>
          <nxce:longitudeDMS degrees="062" minutes="42" direction="WEST"/>
        </nxce:longitude>
      </nxcm:position>
    </trackInformation>
  </asdiMessage>

Please, can anyone help me? I want to do everything in R:

1. extract the tar file and decode the raw files into XML files
2. read the data from the multiple extracted XML files

Thanks in advance!

1 solution

#1



Depending on your operating system, R's untar function might help; see ?untar. As an example of using the XML package, we can load the document

library(XML)
xml = xmlParse("2011-03-15T23_57_59Z.xml")

then query it using the XPath language (see especially section 2.5), e.g., for aircraft id and longitude

> xpathSApply(xml, "//nxcm:aircraftId", xmlValue)
[1] "UAL966" "UAL936"
> xpathSApply(xml, "//nxce:longitudeDMS/@degrees")
degrees degrees 
  "056"   "062" 
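
To get one row per aircraft instead of separate vectors, the same XPath queries can be run relative to each trackInformation node. The following is only a sketch under some assumptions: the helper names (first_node, dms_to_decimal, read_tracks) are made up for illustration, the namespace URIs are copied from the root element of the sample file, and the positions are assumed to carry only degrees, minutes and a direction as in the sample.

```r
library(XML)

# Namespace prefixes and URIs copied from the sample file's root element.
# "xis" is used for the unprefixed elements, which share the same URI.
ns <- c(xis  = "http://tfm.faa.gov/tfms/TFMS_XIS",
        nxcm = "http://tfm.faa.gov/tfms/NasXCommonMessages",
        nxce = "http://tfm.faa.gov/tfms/NasXCoreElements")

# Hypothetical helper: first node matching `path` below `node`, or NULL.
first_node <- function(node, path) {
  hits <- getNodeSet(node, path, namespaces = ns)
  if (length(hits) == 0) NULL else hits[[1]]
}

# Hypothetical helper: convert a latitudeDMS/longitudeDMS node to decimal
# degrees. Assumes only degrees/minutes/direction attributes, as in the sample.
dms_to_decimal <- function(node) {
  if (is.null(node)) return(NA_real_)
  deg <- as.numeric(xmlGetAttr(node, "degrees", NA))
  mins <- as.numeric(xmlGetAttr(node, "minutes", 0))
  direction <- xmlGetAttr(node, "direction", "")
  sign <- if (direction %in% c("SOUTH", "WEST")) -1 else 1
  sign * (deg + mins / 60)
}

# Hypothetical helper: one data frame row per trackInformation element.
read_tracks <- function(file) {
  doc <- xmlParse(file)
  tracks <- getNodeSet(doc, "//xis:trackInformation", namespaces = ns)
  rows <- lapply(tracks, function(tr) {
    # Text content of the first match, or NA when the element is absent.
    txt <- function(path) {
      n <- first_node(tr, path)
      if (is.null(n)) NA_character_ else xmlValue(n)
    }
    data.frame(
      aircraftId = txt(".//nxcm:aircraftId"),
      speed      = as.numeric(txt(".//nxcm:speed")),
      altitude   = as.numeric(txt(".//nxce:simpleAltitude")),
      latitude   = dms_to_decimal(first_node(tr, ".//nxce:latitudeDMS")),
      longitude  = dms_to_decimal(first_node(tr, ".//nxce:longitudeDMS")),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)  # NULL if the file has no trackInformation elements
}
```

Calling read_tracks("2011-03-15T23_57_59Z.xml") should then give a data frame with one row per track. Messages of other types, if the real files contain any, are simply skipped by the XPath.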

There are also convenience functions such as xmlToDataFrame, which might be fun to explore.
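
For the two steps in the question (unpack the archive, then read many files), something along these lines might work. It is a sketch that assumes the archive directly contains .xml files; if the entries really are in some binary ?BIN format, untar will only unpack them and a separate decoding step would still be needed. The function name ids_from_archive and the output directory are made up for illustration.

```r
library(XML)

# Hypothetical helper: unpack `archive` into `out_dir` and return the aircraft
# ids found in every .xml file, as a list named by file name.
ids_from_archive <- function(archive, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  untar(archive, exdir = out_dir)   # untar(archive, list = TRUE) only lists

  xml_files <- list.files(out_dir, pattern = "\\.xml$",
                          full.names = TRUE, recursive = TRUE)

  # Namespace URI copied from the sample file's root element.
  ns <- c(nxcm = "http://tfm.faa.gov/tfms/NasXCommonMessages")

  ids <- lapply(xml_files, function(f) {
    doc <- xmlParse(f)
    xpathSApply(doc, "//nxcm:aircraftId", xmlValue, namespaces = ns)
  })
  names(ids) <- basename(xml_files)
  ids
}
```

For example, ids <- ids_from_archive("20110205.tar", file.path(tempdir(), "asdi")) would give one element per extracted file, such as ids[["2011-03-15T23_57_59Z.xml"]].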
