XQuery: How to split a large XML file into smaller files

Time: 2022-09-06 22:50:55

We have very large data files, like this one:

<itemList>
 <item>A1</item>
 <item>A2</item>
 <item>A3</item>
 <item>...</item>
 <item>A6000</item>
</itemList>

We have to split them into smaller chunks of 1000 items each, so that the result looks like this:

<itemList>
 <itemSet>
  <item>A1</item>
  <item>...</item>
  <item>A1000</item>
 </itemSet>
 <itemSet>
  <item>...</item>

What is the best way to split that in XQuery? Any ideas?

Thanks a lot

2 solutions

#1


4  

A windowed for loop is the best answer (see Ghislain's answer), but windowing is only available in XQuery 3.0, which your processor might not support. In that case, you can roll your own grouping, just as you would in any other language:

declare variable $itemList := <itemList>
 <item>A1</item>
 <item>A2</item>
 <item>A3</item>
 <item>A4</item>
 <item>A5</item>
 <item>A6</item>
 <item>A7</item>
 <item>A8</item>
</itemList>;
declare variable $groupSize := 3;

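element itemList {
  (: number of groups, rounded up; xs:integer() because "to" expects integer operands and ceiling() of a decimal stays a decimal :)
  for $group in (0 to xs:integer(fn:ceiling(count($itemList/item) div $groupSize)) - 1)
  let $groupStart := ($group * $groupSize) + 1
  let $groupEnd := ($group + 1) * $groupSize
  return
    element itemGroup {
      (: compare position() against the range; a bare integer range is not a valid positional predicate :)
      $itemList/item[position() = ($groupStart to $groupEnd)]
    }
}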
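
The same manual grouping can also be sketched with fn:subsequence, which avoids computing the group count and the end index by hand. This is only an illustrative variant, not part of the answer above: the inline sample data and the group size of 3 are assumptions, and it uses the itemSet element name from the question.

```xquery
xquery version "1.0";

(: Hypothetical sample data: eight items A1..A8 standing in for the real file. :)
declare variable $itemList := <itemList>{
  for $n in 1 to 8 return <item>A{$n}</item>
}</itemList>;

(: Assumed chunk size for the demo; the question asks for 1000. :)
declare variable $groupSize := 3;

<itemList>{
  let $items := $itemList/item
  (: Start positions of each chunk: 1, 1 + size, 1 + 2 * size, ... :)
  for $start in (1 to count($items))[(. - 1) mod $groupSize = 0]
  return
    (: subsequence() simply returns fewer items for the last, shorter chunk. :)
    <itemSet>{ subsequence($items, $start, $groupSize) }</itemSet>
}</itemList>
```

With the sample data this produces three itemSet elements holding A1 to A3, A4 to A6, and A7 to A8.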

#2


4  

I'd suggest a windowing query:

<itemList>
{
    for tumbling window $items in $document/item
    start at $i when true()
    end at $j when $j eq $i + 999
    return
        <itemSet>
        {
                $items
        }
        </itemSet>
}
</itemList>

You can test it with Zorba here (I used smaller windows)
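
For trying the windowing query without the original data, here is a minimal self-contained sketch. It assumes an XQuery 3.0 processor; the inline sample data, the $itemList and $groupSize variables, and the window size of 3 are assumptions made for the demo, standing in for $document and the fixed offset of 999 in the query above.

```xquery
xquery version "3.0";

(: Hypothetical sample data: eight items A1..A8 standing in for the real file. :)
declare variable $itemList := <itemList>{
  for $n in 1 to 8 return <item>A{$n}</item>
}</itemList>;

(: Assumed window size for the demo; the question asks for 1000. :)
declare variable $groupSize := 3;

<itemList>{
  for tumbling window $items in $itemList/item
    (: Every item may start a window; tumbling windows never overlap. :)
    start at $i when true()
    (: Close the window once it spans $groupSize positions. :)
    end at $j when $j eq $i + $groupSize - 1
  return
    <itemSet>{ $items }</itemSet>
}</itemList>
```

Because the end clause is not written as "only end", the final window is still emitted when the input runs out before the end condition is met, so the result is three itemSet elements holding A1 to A3, A4 to A6, and A7 to A8.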
