如何在不覆盖数据(使用熊猫)的情况下写入现有的excel文件?

时间:2021-04-27 20:27:20

I use pandas to write to excel file in the following fashion:

我用熊猫写excel文件的方式如下:

import pandas

writer = pandas.ExcelWriter('Masterfile.xlsx') 

data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])

writer.save()

Masterfile.xlsx already consists of number of different tabs.

Masterfile。xlsx已经包含了许多不同的选项卡。

Pandas correctly writes to "Main" sheet, unfortunately it also deletes all other tabs.

熊猫正确地写入“主”页,不幸的是,它也删除所有其他的选项卡。

7 个解决方案

#1


89  

Pandas docs says it uses openpyxl for xlsx files. Quick look through the code in ExcelWriter gives a clue that something like this might work out:

熊猫医生说它使用openpyxl来处理xlsx文件。快速浏览ExcelWriter中的代码,就会发现类似这样的事情可能会发生:

import pandas
from openpyxl import load_workbook

book = load_workbook('Masterfile.xlsx')
writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') 
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])

writer.save()

#2


9  

With openpyxlversion 2.4.0 and pandasversion 0.19.2, the process @ski came up with gets a bit simpler:

使用openpyxlversion 2.4.0和pandasversion 0.19.2, @ski提出的过程变得简单了一些:

import pandas
from openpyxl import load_workbook

with pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') as writer:
    writer.book = load_workbook('Masterfile.xlsx')
    data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])
#That's it!

#3


8  

Old question, but I am guessing some people still search for this - so...

老问题,但是我猜有些人仍然在寻找这个-所以…

I find this method nice because all worksheets are loaded into a dictionary of sheet name and dataframe pairs, created by pandas with the sheetname=None option. It is simple to add, delete or modify worksheets between reading the spreadsheet into the dict format and writing it back from the dict. For me the xlsxwriter works better than openpyxl for this particular task in terms of speed and format.

我发现这个方法很好,因为所有的工作表都被加载到一个表名和dataframe对字典中,由熊猫用sheetname=None选项创建。在将电子表格读取到dict格式并将其写入到dict格式之间添加、删除或修改工作表非常简单。

Note: future versions of pandas (0.21.0+) will change the "sheetname" parameter to "sheet_name".

注意:未来的熊猫版本(0.21.0+)将把“sheetname”参数改为“sheet_name”。

# read a single or multi-sheet excel file
# (returns dict of sheetname(s), dataframe(s))
ws_dict = pd.read_excel(excel_file_path,
                        sheetname=None)

# all worksheets are accessible as dataframes.

# easy to change a worksheet as a dataframe:
mod_df = ws_dict['existing_worksheet']

# do work on mod_df...then reassign
ws_dict['existing_worksheet'] = mod_df

# add a dataframe to the workbook as a new worksheet with
# ws name, df as dict key, value:
ws_dict['new_worksheet'] = some_other_dataframe

# when done, write dictionary back to excel...
# xlsxwriter honors datetime and date formats
# (only included as example)...
with pd.ExcelWriter(excel_file_path,
                    engine='xlsxwriter',
                    datetime_format='yyyy-mm-dd',
                    date_format='yyyy-mm-dd') as writer:

    for ws_name, df_sheet in ws_dict.items():
        df_sheet.to_excel(writer, sheet_name=ws_name)

For the example in the 2013 question:

例如2013年的问题:

ws_dict = pd.read_excel('Masterfile.xlsx',
                        sheetname=None)

ws_dict['Main'] = data_filtered[['Diff1', 'Diff2']]

with pd.ExcelWriter('Masterfile.xlsx',
                    engine='xlsxwriter') as writer:

    for ws_name, df_sheet in ws_dict.items():
        df_sheet.to_excel(writer, sheet_name=ws_name)

#4


6  

Here is a helper function:

这里有一个辅助函数:

def append_df_to_excel(filename, df, sheet_name='Sheet1', startrow=None,
                       truncate_sheet=False, 
                       **to_excel_kwargs):
    """
    Append a DataFrame [df] to existing Excel file [filename]
    into [sheet_name] Sheet.
    If [filename] doesn't exist, then this function will create it.

    Parameters:
      filename : File path or existing ExcelWriter
                 (Example: '/path/to/file.xlsx')
      df : dataframe to save to workbook
      sheet_name : Name of sheet which will contain DataFrame.
                   (default: 'Sheet1')
      startrow : upper left cell row to dump data frame.
                 Per default (startrow=None) calculate the last row
                 in the existing DF and write to the next row...
      truncate_sheet : truncate (remove and recreate) [sheet_name]
                       before writing DataFrame to Excel file
      to_excel_kwargs : arguments which will be passed to `DataFrame.to_excel()`
                        [can be dictionary]

    Returns: None
    """
    from openpyxl import load_workbook

    # ignore [engine] parameter if it was passed
    if 'engine' in to_excel_kwargs:
        to_excel_kwargs.pop('engine')

    writer = pd.ExcelWriter(filename, engine='openpyxl')

    try:
        # try to open an existing workbook
        writer.book = load_workbook(filename)

        # get the last row in the existing Excel sheet
        # if it was not specified explicitly
        if startrow is None and sheet_name in writer.book.sheetnames:
            startrow = writer.book[sheet_name].max_row

        # truncate sheet
        if truncate_sheet and sheet_name in writer.book.sheetnames:
            # index of [sheet_name] sheet
            idx = writer.book.sheetnames.index(sheet_name)
            # remove [sheet_name]
            writer.book.remove(writer.book.worksheets[idx])
            # create an empty sheet [sheet_name] using old index
            writer.book.create_sheet(sheet_name, idx)

        # copy existing sheets
        writer.sheets = {ws.title:ws for ws in writer.book.worksheets}
    except FileNotFoundError:
        # file does not exist yet, we will create it
        pass

    if startrow is None:
        startrow = 0

    # write out the new sheet
    df.to_excel(writer, sheet_name, startrow=startrow, **to_excel_kwargs)

    # save the workbook
    writer.save()

NOTE: for Pandas < 0.21.0, replace sheet_name with sheetname!

注意:对于熊猫< 0.21.0,用sheetname替换sheet_name !

Usage examples:

使用例子:

append_df_to_excel('d:/temp/test.xlsx', df)

append_df_to_excel('d:/temp/test.xlsx', df, header=None, index=False)

append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2', index=False)

append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2', index=False, startrow=25)

#5


5  

I know this is an older thread, but this is the first item you find when searching, and the above solutions don't work if you need to retain charts in a workbook that you already have created. In that case, xlwings is a better option - it allows you to write to the excel book and keeps the charts/chart data.

我知道这是一个旧的线程,但这是搜索时发现的第一个项,如果您需要在已经创建的工作簿中保留图表,那么上面的解决方案将不起作用。在这种情况下,xlwings是一个更好的选择——它允许您编写excel,并保存图表数据。

simple example:

简单的例子:

import xlwings as xw
import pandas as pd

#create DF
months = ['2017-01','2017-02','2017-03','2017-04','2017-05','2017-06','2017-07','2017-08','2017-09','2017-10','2017-11','2017-12']
value1 = [x * 5+5 for x in range(len(months))]
df = pd.DataFrame(value1, index = months, columns = ['value1'])
df['value2'] = df['value1']+5
df['value3'] = df['value2']+5

#load workbook that has a chart in it
wb = xw.Book('C:\\data\\bookwithChart.xlsx')

ws = wb.sheets['chartData']

ws.range('A1').options(index=False).value = df

wb = xw.Book('C:\\data\\bookwithChart_updated.xlsx')

xw.apps[0].quit()

#6


1  

def append_sheet_to_master(self, master_file_path, current_file_path, sheet_name):
    try:
        master_book = load_workbook(master_file_path)
        master_writer = pandas.ExcelWriter(master_file_path, engine='openpyxl')
        master_writer.book = master_book
        master_writer.sheets = dict((ws.title, ws) for ws in master_book.worksheets)
        current_frames = pandas.ExcelFile(current_file_path).parse(pandas.ExcelFile(current_file_path).sheet_names[0],
                                                               header=None,
                                                               index_col=None)
        current_frames.to_excel(master_writer, sheet_name, index=None, header=False)

        master_writer.save()
    except Exception as e:
        raise e

This works perfectly fine only thing is that formatting of the master file(file to which we add new sheet) is lost.

这非常好,唯一的问题是丢失了主文件(添加新表的文件)的格式。

#7


0  

writer = pd.ExcelWriter('prueba1.xlsx'engine='openpyxl',keep_date_col=True)

The "keep_date_col" hope help you

“keep_date_col”希望对您有所帮助

#1


89  

Pandas docs says it uses openpyxl for xlsx files. Quick look through the code in ExcelWriter gives a clue that something like this might work out:

熊猫医生说它使用openpyxl来处理xlsx文件。快速浏览ExcelWriter中的代码,就会发现类似这样的事情可能会发生:

import pandas
from openpyxl import load_workbook

book = load_workbook('Masterfile.xlsx')
writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') 
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])

writer.save()

#2


9  

With openpyxlversion 2.4.0 and pandasversion 0.19.2, the process @ski came up with gets a bit simpler:

使用openpyxlversion 2.4.0和pandasversion 0.19.2, @ski提出的过程变得简单了一些:

import pandas
from openpyxl import load_workbook

with pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') as writer:
    writer.book = load_workbook('Masterfile.xlsx')
    data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])
#That's it!

#3


8  

Old question, but I am guessing some people still search for this - so...

老问题,但是我猜有些人仍然在寻找这个-所以…

I find this method nice because all worksheets are loaded into a dictionary of sheet name and dataframe pairs, created by pandas with the sheetname=None option. It is simple to add, delete or modify worksheets between reading the spreadsheet into the dict format and writing it back from the dict. For me the xlsxwriter works better than openpyxl for this particular task in terms of speed and format.

我发现这个方法很好,因为所有的工作表都被加载到一个表名和dataframe对字典中,由熊猫用sheetname=None选项创建。在将电子表格读取到dict格式并将其写入到dict格式之间添加、删除或修改工作表非常简单。

Note: future versions of pandas (0.21.0+) will change the "sheetname" parameter to "sheet_name".

注意:未来的熊猫版本(0.21.0+)将把“sheetname”参数改为“sheet_name”。

# read a single or multi-sheet excel file
# (returns dict of sheetname(s), dataframe(s))
ws_dict = pd.read_excel(excel_file_path,
                        sheetname=None)

# all worksheets are accessible as dataframes.

# easy to change a worksheet as a dataframe:
mod_df = ws_dict['existing_worksheet']

# do work on mod_df...then reassign
ws_dict['existing_worksheet'] = mod_df

# add a dataframe to the workbook as a new worksheet with
# ws name, df as dict key, value:
ws_dict['new_worksheet'] = some_other_dataframe

# when done, write dictionary back to excel...
# xlsxwriter honors datetime and date formats
# (only included as example)...
with pd.ExcelWriter(excel_file_path,
                    engine='xlsxwriter',
                    datetime_format='yyyy-mm-dd',
                    date_format='yyyy-mm-dd') as writer:

    for ws_name, df_sheet in ws_dict.items():
        df_sheet.to_excel(writer, sheet_name=ws_name)

For the example in the 2013 question:

例如2013年的问题:

ws_dict = pd.read_excel('Masterfile.xlsx',
                        sheetname=None)

ws_dict['Main'] = data_filtered[['Diff1', 'Diff2']]

with pd.ExcelWriter('Masterfile.xlsx',
                    engine='xlsxwriter') as writer:

    for ws_name, df_sheet in ws_dict.items():
        df_sheet.to_excel(writer, sheet_name=ws_name)

#4


6  

Here is a helper function:

这里有一个辅助函数:

def append_df_to_excel(filename, df, sheet_name='Sheet1', startrow=None,
                       truncate_sheet=False, 
                       **to_excel_kwargs):
    """
    Append a DataFrame [df] to existing Excel file [filename]
    into [sheet_name] Sheet.
    If [filename] doesn't exist, then this function will create it.

    Parameters:
      filename : File path or existing ExcelWriter
                 (Example: '/path/to/file.xlsx')
      df : dataframe to save to workbook
      sheet_name : Name of sheet which will contain DataFrame.
                   (default: 'Sheet1')
      startrow : upper left cell row to dump data frame.
                 Per default (startrow=None) calculate the last row
                 in the existing DF and write to the next row...
      truncate_sheet : truncate (remove and recreate) [sheet_name]
                       before writing DataFrame to Excel file
      to_excel_kwargs : arguments which will be passed to `DataFrame.to_excel()`
                        [can be dictionary]

    Returns: None
    """
    from openpyxl import load_workbook

    # ignore [engine] parameter if it was passed
    if 'engine' in to_excel_kwargs:
        to_excel_kwargs.pop('engine')

    writer = pd.ExcelWriter(filename, engine='openpyxl')

    try:
        # try to open an existing workbook
        writer.book = load_workbook(filename)

        # get the last row in the existing Excel sheet
        # if it was not specified explicitly
        if startrow is None and sheet_name in writer.book.sheetnames:
            startrow = writer.book[sheet_name].max_row

        # truncate sheet
        if truncate_sheet and sheet_name in writer.book.sheetnames:
            # index of [sheet_name] sheet
            idx = writer.book.sheetnames.index(sheet_name)
            # remove [sheet_name]
            writer.book.remove(writer.book.worksheets[idx])
            # create an empty sheet [sheet_name] using old index
            writer.book.create_sheet(sheet_name, idx)

        # copy existing sheets
        writer.sheets = {ws.title:ws for ws in writer.book.worksheets}
    except FileNotFoundError:
        # file does not exist yet, we will create it
        pass

    if startrow is None:
        startrow = 0

    # write out the new sheet
    df.to_excel(writer, sheet_name, startrow=startrow, **to_excel_kwargs)

    # save the workbook
    writer.save()

NOTE: for Pandas < 0.21.0, replace sheet_name with sheetname!

注意:对于熊猫< 0.21.0,用sheetname替换sheet_name !

Usage examples:

使用例子:

append_df_to_excel('d:/temp/test.xlsx', df)

append_df_to_excel('d:/temp/test.xlsx', df, header=None, index=False)

append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2', index=False)

append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2', index=False, startrow=25)

#5


5  

I know this is an older thread, but this is the first item you find when searching, and the above solutions don't work if you need to retain charts in a workbook that you already have created. In that case, xlwings is a better option - it allows you to write to the excel book and keeps the charts/chart data.

我知道这是一个旧的线程,但这是搜索时发现的第一个项,如果您需要在已经创建的工作簿中保留图表,那么上面的解决方案将不起作用。在这种情况下,xlwings是一个更好的选择——它允许您编写excel,并保存图表数据。

simple example:

简单的例子:

import xlwings as xw
import pandas as pd

#create DF
months = ['2017-01','2017-02','2017-03','2017-04','2017-05','2017-06','2017-07','2017-08','2017-09','2017-10','2017-11','2017-12']
value1 = [x * 5+5 for x in range(len(months))]
df = pd.DataFrame(value1, index = months, columns = ['value1'])
df['value2'] = df['value1']+5
df['value3'] = df['value2']+5

#load workbook that has a chart in it
wb = xw.Book('C:\\data\\bookwithChart.xlsx')

ws = wb.sheets['chartData']

ws.range('A1').options(index=False).value = df

wb = xw.Book('C:\\data\\bookwithChart_updated.xlsx')

xw.apps[0].quit()

#6


1  

def append_sheet_to_master(self, master_file_path, current_file_path, sheet_name):
    try:
        master_book = load_workbook(master_file_path)
        master_writer = pandas.ExcelWriter(master_file_path, engine='openpyxl')
        master_writer.book = master_book
        master_writer.sheets = dict((ws.title, ws) for ws in master_book.worksheets)
        current_frames = pandas.ExcelFile(current_file_path).parse(pandas.ExcelFile(current_file_path).sheet_names[0],
                                                               header=None,
                                                               index_col=None)
        current_frames.to_excel(master_writer, sheet_name, index=None, header=False)

        master_writer.save()
    except Exception as e:
        raise e

This works perfectly fine only thing is that formatting of the master file(file to which we add new sheet) is lost.

这非常好,唯一的问题是丢失了主文件(添加新表的文件)的格式。

#7


0  

writer = pd.ExcelWriter('prueba1.xlsx'engine='openpyxl',keep_date_col=True)

The "keep_date_col" hope help you

“keep_date_col”希望对您有所帮助