Getting a cell value as formatted HTML from Excel using Apache POI

Time: 2021-11-30 15:40:16

I am using Apache POI to read an Excel document. So far it has served my purpose, but one thing I am stuck on is extracting the value of a cell as HTML.

I have one cell in which the user enters a string and applies some formatting (bullets, numbering, bold, italic, etc.).

So when I read it, the content should come back as HTML rather than the plain string that POI returns.

I have gone through almost the entire POI API but could not find anything that does this. I want to retain the formatting of just one particular column, not the entire workbook. By column I mean the text entered in that column; I want that text as HTML.

I also explored and tried Apache Tika. However, as I understand it, Tika can only give me the text, not its formatting.

Could someone please guide me? I am running out of options.

Suppose I wrote "My name is Angel and Demon" in Excel, with Angel in bold and Demon in italic.

The output I should get in Java is My name is <b>Angel</b> and <i>Demon</i>

1 solution

#1


I pasted this as Unicode into cell A1 of an xlsx file:

<html><p>This is a test. Will this text be <b>bold</b> or <i>italic</i></p></html>

This HTML line produces this in the cell:

This is a test. Will this text be bold or italic

My code:

import java.io.FileInputStream;
import java.io.IOException;

import org.apache.poi.ss.util.CellReference;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFFont;
import org.apache.poi.xssf.usermodel.XSSFRichTextString;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ExcelWithHtml {
    // Cell A1 holds the rendered form of this line:
    // <html><p>This is a test. Will this text be <b>bold</b> or
    // <i>italic</i></p></html>

    public static void main(String[] args) throws IOException {
        new ExcelWithHtml()
                .readFirstCellOfXSSF("/Users/rcacheira/testeHtml.xlsx");
    }

    // Which tags are currently open while walking the formatting runs.
    boolean inBold = false;
    boolean inItalic = false;

    public void readFirstCellOfXSSF(String filePathName) throws IOException {
        // try-with-resources closes the stream and the workbook even on failure
        try (FileInputStream fis = new FileInputStream(filePathName);
                XSSFWorkbook wb = new XSSFWorkbook(fis)) {
            XSSFSheet sheet = wb.getSheetAt(0);

            String cellHtml = getHtmlFormatedCellValueFromSheet(sheet, "A1");

            System.out.println(cellHtml);
        }
    }

    public String getHtmlFormatedCellValueFromSheet(XSSFSheet sheet,
            String cellName) {

        CellReference cellReference = new CellReference(cellName);
        XSSFRow row = sheet.getRow(cellReference.getRow());
        XSSFCell cell = row.getCell(cellReference.getCol());

        XSSFRichTextString cellText = cell.getRichStringCellValue();
        String text = cellText.getString();

        // A cell without rich-text runs has nothing to tag.
        if (cellText.numFormattingRuns() == 0) {
            return text;
        }

        StringBuilder htmlCode = new StringBuilder();
        for (int i = 0; i < cellText.numFormattingRuns(); i++) {
            // The font is null for a run that carries no explicit formatting.
            htmlCode.append(getFormatFromFont(cellText
                    .getFontOfFormattingRun(i)));

            int indexStart = cellText.getIndexOfFormattingRun(i);
            int indexEnd = indexStart + cellText.getLengthOfFormattingRun(i);

            htmlCode.append(text, indexStart, indexEnd);
        }

        // Close anything still open after the last run.
        htmlCode.append(closeOpenTags());
        return htmlCode.toString();
    }

    // Returns the tags needed to switch from the previous run's style to this one.
    private String getFormatFromFont(XSSFFont font) {
        boolean bold = font != null && font.getBold();
        boolean italic = font != null && font.getItalic();
        if (bold == inBold && italic == inItalic) {
            return ""; // same style as the previous run
        }

        // Close what is open, then reopen, so the tags stay properly nested.
        String formatHtmlCode = closeOpenTags();
        if (italic) {
            formatHtmlCode += "<i>";
            inItalic = true;
        }
        if (bold) {
            formatHtmlCode += "<b>";
            inBold = true;
        }
        return formatHtmlCode;
    }

    // Closes open tags in reverse order of opening (<i> opens before <b>).
    private String closeOpenTags() {
        String closing = "";
        if (inBold) {
            closing += "</b>";
            inBold = false;
        }
        if (inItalic) {
            closing += "</i>";
            inItalic = false;
        }
        return closing;
    }

}

My output:

This is a test. Will this text be <b>bold</b> or <i>italic</i>

I think this is what you want. I am only showing what is possible here; I did not follow best coding practices, I just wrote this quickly to produce an output.
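
If you would rather not track open tags across runs, another option is to wrap each run in its own tags and escape the text as you go. Below is a minimal sketch of that idea in plain Java, independent of POI. The names RunsToHtml and Run are hypothetical, and only bold and italic are handled. In practice you would presumably fill each Run from getIndexOfFormattingRun, getLengthOfFormattingRun and getFontOfFormattingRun, treating a null font as plain text.

```java
import java.util.ArrayList;
import java.util.List;

final class RunsToHtml {

    /** Hypothetical holder for one formatting run: its text plus two style flags. */
    static final class Run {
        final String text;
        final boolean bold;
        final boolean italic;

        Run(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }
    }

    /** Escapes the characters that would otherwise be read as markup. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Wraps every run in its own tags, so the tags are always properly nested. */
    static String toHtml(List<Run> runs) {
        StringBuilder html = new StringBuilder();
        for (Run run : runs) {
            if (run.bold) html.append("<b>");
            if (run.italic) html.append("<i>");
            html.append(escape(run.text));
            if (run.italic) html.append("</i>");
            if (run.bold) html.append("</b>");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        // The example from the question, split into runs by hand.
        List<Run> runs = new ArrayList<>();
        runs.add(new Run("My name is ", false, false));
        runs.add(new Run("Angel", true, false));
        runs.add(new Run(" and ", false, false));
        runs.add(new Run("Demon", false, true));
        System.out.println(toHtml(runs));
        // prints: My name is <b>Angel</b> and <i>Demon</i>
    }
}
```

The trade-off is that two adjacent runs with the same style each get their own pair of tags, which is slightly more verbose but still valid HTML.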
