Split a string into pieces of max length X - split only at spaces

Time: 2022-07-03 21:37:53

I have a long string which I would like to break into pieces, of max X characters. BUT, only at a space (if some word in the string is longer than X chars, just put it into its own piece).

I don't even know how to begin to do this ... Pythonically

pseudo code:

declare a list
while still some string left:
   take the first X chars of the string
   find the last space in that
   write everything before the space to a new list entry
   delete everything to the left of the space
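
For reference, the pseudocode above could be sketched in Python roughly as follows. This is only a minimal sketch: the name `split_at_spaces` and its parameters are made up for illustration, and it assumes that plain spaces are the only separators. It looks one character past the limit so that a space directly after a full-length piece still counts as a cut point, and it puts an over-long word into its own piece.

```python
def split_at_spaces(text, max_len):
    """Cut text into pieces of at most max_len chars, cutting only at spaces."""
    pieces = []
    rest = text.strip()
    while rest:
        if len(rest) <= max_len:
            # Everything that is left fits into one piece.
            pieces.append(rest)
            break
        # Last space within the first max_len + 1 characters.
        cut = rest.rfind(" ", 0, max_len + 1)
        if cut == -1:
            # The first word is longer than max_len: cut at the next space,
            # or take the whole remainder if there is no space at all.
            cut = rest.find(" ")
            if cut == -1:
                pieces.append(rest)
                break
        pieces.append(rest[:cut].rstrip())
        rest = rest[cut:].lstrip()
    return pieces


print(split_at_spaces("hello, this is some text", 16))
```

Slicing the remainder on every pass keeps the code close to the pseudocode; for very long inputs, working on a list of words avoids the repeated copying.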

Before I code that up, is there some Python module that can help me (I don't think that pprint can)?

1 solution

#1


14  

This is how I would approach it: first, split the text into words. Start a line with the first word and iterate over the remaining words. If the next word fits on the current line, add it; otherwise, finish the current line and use the word as the first word of the next line. Repeat until all the words are used up.

Here's some code:

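text = "hello, this is some text to break up, with some reeeeeeeeeaaaaaaally long words."
n = 16  # maximum line length

words = iter(text.split())  # split on any whitespace
lines, current = [], next(words, None)  # None if the text contains no words
for word in words:
    # the +1 accounts for the space that joins the word to the current line
    if len(current) + 1 + len(word) > n:
        lines.append(current)  # line is full: store it and start a new one
        current = word
    else:
        current += " " + word
if current is not None:
    lines.append(current)  # store the last, unfinished line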

Update: As @swenzel pointed out in the comments, there is in fact a module for this: textwrap. For this text it produces the same result as the code above (note that by default it will also break on hyphens):

import textwrap
lines = textwrap.wrap(text, n, break_long_words=False)
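
Since the question asks for breaks at spaces only, the hyphen behaviour may be unwanted. A possible variant, sketched here with a hypothetical helper name `wrap_at_spaces`, additionally passes `break_on_hyphens=False`, which is a documented option of `textwrap.wrap`:

```python
import textwrap


def wrap_at_spaces(text, width):
    """Wrap at whitespace only; long or hyphenated words stay in one piece."""
    return textwrap.wrap(
        text,
        width,
        break_long_words=False,   # never cut inside a word longer than width
        break_on_hyphens=False,   # do not treat hyphens as break points
    )


sample = "a well-known fact"
# Default hyphen handling: may split "well-known" after the hyphen.
print(textwrap.wrap(sample, 8, break_long_words=False))
# Whitespace-only wrapping: "well-known" stays whole, even though it exceeds the width.
print(wrap_at_spaces(sample, 8))
```

Note that textwrap also normalizes whitespace by default (for example, it converts tabs and newlines to spaces), so the pieces are not always exact substrings of the input.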
