Reading a file in reverse order using Python.

Time: 2022-09-04 11:29:59

How can I read a file in reverse order using Python? I want to read a file from the last line to the first line.

13 solutions

#1


55  

for line in reversed(open("filename").readlines()):
    print line.rstrip()

And in Python 3:

for line in reversed(list(open("filename"))):
    print(line.rstrip())

#2


93  

A correct, efficient answer written as a generator.

import os

def reverse_readline(filename, buf_size=8192):
    """a generator that returns the lines of a file in reverse order"""
    with open(filename) as fh:
        segment = None
        offset = 0
        fh.seek(0, os.SEEK_END)
        file_size = remaining_size = fh.tell()
        while remaining_size > 0:
            offset = min(file_size, offset + buf_size)
            fh.seek(file_size - offset)
            buffer = fh.read(min(remaining_size, buf_size))
            remaining_size -= buf_size
            lines = buffer.split('\n')
            # the first line of the buffer is probably not a complete line so
            # we'll save it and append it to the last line of the next buffer
            # we read
            if segment is not None:
                # if the previous chunk starts right from the beginning of line
                # do not concat the segment to the last line of new chunk
                # instead, yield the segment first
                # (compare by value: 'is not' would test object identity)
                if buffer[-1] != '\n':
                    lines[-1] += segment
                else:
                    yield segment
            segment = lines[0]
            for index in range(len(lines) - 1, 0, -1):
                if len(lines[index]):
                    yield lines[index]
        # Don't yield None if the file was empty
        if segment is not None:
            yield segment
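
On Python 3, text-mode files do not reliably support seeking to arbitrary byte offsets, so a variant of the same chunked approach that opens the file in binary mode may be more robust. The following is only a sketch under stated assumptions, not the answer author's code: the name reverse_readline_bytes and its encoding parameter are hypothetical, lines are assumed to end with "\n", and the encoding is assumed to be one (such as UTF-8 or ASCII) in which the newline byte never appears inside a multi-byte character.

```python
import os


def reverse_readline_bytes(filename, buf_size=8192, encoding="utf-8"):
    """Yield the lines of a file in reverse order, without trailing newlines.

    Hypothetical helper: reads fixed-size chunks from the end of the file in
    binary mode and decodes each complete line separately.
    """
    with open(filename, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        position = fh.tell()
        tail = b""          # bytes of a line whose beginning is not read yet
        first_chunk = True  # stays True only if the file is empty
        while position > 0:
            read_size = min(buf_size, position)
            position -= read_size
            fh.seek(position)
            chunk = fh.read(read_size) + tail
            if first_chunk:
                first_chunk = False
                # A single trailing newline ends the last line; it does not
                # start an extra empty one.
                if chunk.endswith(b"\n"):
                    chunk = chunk[:-1]
            pieces = chunk.split(b"\n")
            # pieces[0] may be incomplete, so carry it over to the next chunk.
            tail = pieces[0]
            for piece in reversed(pieces[1:]):
                yield piece.decode(encoding)
        # Whatever is left over is the first line of the file.
        if not first_chunk:
            yield tail.decode(encoding)
```

Unlike the generator above, this sketch keeps empty lines. Because each line is decoded only once it is complete, a multi-byte character split across two chunks is not a problem.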

#3


14  

How about something like this:

import os


def readlines_reverse(filename):
    with open(filename) as qfile:
        qfile.seek(0, os.SEEK_END)
        position = qfile.tell()
        line = ''
        while position >= 0:
            qfile.seek(position)
            next_char = qfile.read(1)
            if next_char == "\n":
                yield line[::-1]
                line = ''
            else:
                line += next_char
            position -= 1
        yield line[::-1]


if __name__ == '__main__':
    for qline in readlines_reverse(raw_input()):
        print qline

Since the file is read character by character in reverse order, it will work even on very large files, as long as individual lines fit into memory.

#4


8  

for line in reversed(open("file").readlines()):
    print line.rstrip()

If you are on Linux, you can use the tac command.

$ tac file

You can also find two recipes on ActiveState.

#5


8  

import os
import re
import sys

def filerev(somefile, buffer=0x20000):
  somefile.seek(0, os.SEEK_END)
  size = somefile.tell()
  lines = ['']
  rem = size % buffer
  pos = max(0, (size // buffer - 1) * buffer)
  while pos >= 0:
    somefile.seek(pos, os.SEEK_SET)
    data = somefile.read(rem + buffer) + lines[0]
    rem = 0
    lines = re.findall('[^\n]*\n?', data)
    ix = len(lines) - 2
    while ix > 0:
      yield lines[ix]
      ix -= 1
    pos -= buffer
  else:
    yield lines[0]

with open(sys.argv[1], 'r') as f:
  for line in filerev(f):
    sys.stdout.write(line)

#6


7  

You can also use the Python module file_read_backwards.

After installing it via pip install file_read_backwards (v1.2.1), you can read the entire file backwards, line by line, in a memory-efficient manner:

#!/usr/bin/env python2.7

from file_read_backwards import FileReadBackwards

with FileReadBackwards("/path/to/file", encoding="utf-8") as frb:
    for l in frb:
        print l

It supports the "utf-8", "latin-1", and "ascii" encodings.

Python 3 is also supported. Further documentation can be found at http://file-read-backwards.readthedocs.io/en/latest/readme.html

#7


2  

Here is my implementation. You can limit the RAM usage by changing the "buffer" variable. There is a known bug: the program prints an empty line at the beginning.

RAM usage may also increase if there is no newline for more than buffer bytes: the "line_leak" variable keeps growing until a newline ("\n") is seen.

This also works for 16 GB files, which are bigger than my total memory.

import os,sys
buffer = 1024*1024 # 1MB
f = open(sys.argv[1])
f.seek(0, os.SEEK_END)
filesize = f.tell()

division, remainder = divmod(filesize, buffer)
line_leak=''

for chunk_counter in range(1,division + 2):
    if division - chunk_counter < 0:
        f.seek(0, os.SEEK_SET)
        chunk = f.read(remainder)
    elif division - chunk_counter >= 0:
        f.seek(-(buffer*chunk_counter), os.SEEK_END)
        chunk = f.read(buffer)

    chunk_lines_reversed = list(reversed(chunk.split('\n')))
    if line_leak: # add line_leak from previous chunk to beginning
        chunk_lines_reversed[0] += line_leak

    # after reversed, save the leakedline for next chunk iteration
    line_leak = chunk_lines_reversed.pop()

    if chunk_lines_reversed:
        print "\n".join(chunk_lines_reversed)
    # print the last leaked line
    if division - chunk_counter < 0:
        print line_leak

#8


2  

A simple function to create a second, reversed file (Linux only):

import os
def tac(file1, file2):
     print(os.system('tac %s > %s' % (file1,file2)))

How to use it:

tac('ordered.csv', 'reversed.csv')
f = open('reversed.csv')
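
Building the shell command with string formatting breaks on file names that contain spaces or shell metacharacters. One possible alternative, sketched here under the assumption that the tac command (GNU coreutils) is on the PATH, passes the arguments as a list to subprocess.run and writes the output through a file handle. The function name tac_to_file is hypothetical.

```python
import shutil
import subprocess


def tac_to_file(src, dst):
    """Write the lines of src to dst in reverse order using the tac command."""
    if shutil.which("tac") is None:
        raise RuntimeError("tac command not found on PATH")
    with open(dst, "wb") as out:
        # No shell is involved, so file names are passed through untouched.
        # "--" keeps a file name starting with "-" from being read as an option.
        subprocess.run(["tac", "--", src], stdout=out, check=True)
```

With check=True, a failing tac raises subprocess.CalledProcessError instead of returning an exit status that is easy to overlook.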

#9


1  

Thanks for the answer, @srohde. As posted, it has a small bug: it checks for the newline character with the 'is' operator. I could not comment on the answer with only 1 reputation. I would also like to manage opening the file outside the function, because that lets me embed my ramblings in luigi tasks.

What I needed to change has the form:

with open(filename) as fp:
    for line in fp:
        #print line,  # contains new line
        print '>{}<'.format(line)

I'd like to change it to:

with open(filename) as fp:
    for line in reversed_fp_iter(fp, 4):
        #print line,  # contains new line
        print '>{}<'.format(line)

Here is a modified version that takes a file handle and keeps newlines:

import os


def reversed_fp_iter(fp, buf_size=8192):
    """a generator that returns the lines of a file in reverse order
    ref: https://*.com/a/23646049/8776239
    """
    segment = None  # holds possible incomplete segment at the beginning of the buffer
    offset = 0
    fp.seek(0, os.SEEK_END)
    file_size = remaining_size = fp.tell()
    while remaining_size > 0:
        offset = min(file_size, offset + buf_size)
        fp.seek(file_size - offset)
        buffer = fp.read(min(remaining_size, buf_size))
        remaining_size -= buf_size
        lines = buffer.splitlines(True)
        # the first line of the buffer is probably not a complete line so
        # we'll save it and append it to the last line of the next buffer
        # we read
        if segment is not None:
            # if the previous chunk starts right from the beginning of line
            # do not concat the segment to the last line of new chunk
            # instead, yield the segment first
            if buffer[-1] == '\n':
                #print 'buffer ends with newline'
                yield segment
            else:
                lines[-1] += segment
                #print 'enlarged last line to >{}<, len {}'.format(lines[-1], len(lines))
        segment = lines[0]
        for index in range(len(lines) - 1, 0, -1):
            if len(lines[index]):
                yield lines[index]
    # Don't yield None if the file was empty
    if segment is not None:
        yield segment

#10


0  

def reverse_lines(filename):
    # Use a with block so the file is closed once the lines are read.
    with open(filename) as f:
        return f.readlines()[::-1]

#11


0  

Always use a with statement when working with files, as it closes the file for you:

with open('filename', 'r') as f:
    for line in reversed(f.readlines()):
        print line

Or in Python 3:

with open('filename', 'r') as f:
    for line in reversed(f.readlines()):
        print(line)

#12


0  

If you are concerned about file size / memory usage, memory-mapping the file and scanning backwards for newlines is a solution:

How to search for a string in text files?
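
The memory-mapping idea can be sketched roughly as follows. This is an illustration under assumptions, not the code from the linked question: the name mmap_reverse_lines is hypothetical, the file is assumed to use "\n" line endings, and lines are yielded as bytes without the trailing newline. It uses mmap.rfind to locate the newline that precedes each line, so only the pages actually touched need to be resident in memory.

```python
import mmap
import os


def mmap_reverse_lines(filename):
    """Yield the lines of a file as bytes objects, last line first."""
    # An empty file cannot be memory-mapped, so handle that case up front.
    if os.path.getsize(filename) == 0:
        return
    with open(filename, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            end = len(mm)
            # Ignore a single newline that terminates the last line.
            if mm[end - 1:end] == b"\n":
                end -= 1
            while True:
                # Search backwards for the newline before the current line.
                start = mm.rfind(b"\n", 0, end)
                # When no newline is found, start is -1 and the slice
                # begins at offset 0, which is the first line of the file.
                yield mm[start + 1:end]
                if start == -1:
                    break
                end = start
```

A caller that needs text can decode each yielded bytes object, for example with line.decode("utf-8").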

#13


-2  

I had to do this some time ago and used the code below. It pipes to the shell. I am afraid I do not have the complete script anymore. If you are on a Unix-like operating system, you can use "tac"; however, on Mac OS X, for example, the tac command does not work, so use tail -r instead. The code snippet below tests which platform you are on and adjusts the command accordingly:

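import sys

# We need a command to reverse the line order of the file. On Linux this
# is 'tac', on OSX it is 'tail -r'
# 'tac' is not supported on osx, 'tail -r' is not supported on linux.
# 'command' is the shell pipeline string built earlier in the script (not shown).

if sys.platform == "darwin":
    command += "|tail -r"
elif sys.platform.startswith("linux"):  # "linux2" on Python 2, "linux" on Python 3.3+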
    command += "|tac"
else:
    raise EnvironmentError('Platform %s not supported' % sys.platform)
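
As a self-contained illustration of the same platform check, the sketch below returns the reversing command as an argument list and runs it with subprocess instead of appending to a shell pipeline. The names reverse_command and reverse_with_shell_tool are hypothetical, and the sketch assumes the chosen tool is installed on the machine.

```python
import subprocess
import sys


def reverse_command(platform=sys.platform):
    """Return the argv prefix of a line-reversing command for the platform."""
    if platform == "darwin":
        return ["tail", "-r"]
    if platform.startswith("linux"):
        return ["tac"]
    raise EnvironmentError("Platform %s not supported" % platform)


def reverse_with_shell_tool(filename):
    """Return the lines of filename in reverse order as a list of strings."""
    result = subprocess.run(
        reverse_command() + [filename],
        stdout=subprocess.PIPE,
        check=True,
        universal_newlines=True,  # decode the output as text
    )
    return result.stdout.splitlines()
```

Keeping the platform decision in its own small function makes it possible to check the mapping without running any external command.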
