Getting Excel cell values as HTML-formatted text using Apache POI

Time: 2021-11-30 15:40:16

I am using Apache POI to read an Excel document, and so far it has served my purpose. The one thing I am stuck on is extracting the value of a cell as HTML.

I have one cell in which the user enters a string and applies some formatting (bullets, numbering, bold, italic, etc.).


So when I read it, the content should be in HTML format and not the plain string that POI returns.


I have gone through almost the entire POI API but have not been able to find a way to do this. I want to retain the formatting of just one particular column, not the entire Excel file. By column I mean the text that is entered in that column. I want that text as HTML text.

I have also explored and used Apache Tika. However, as I understand it, Tika can only give me the text, not the formatting of the text.

Could someone please guide me? I am running out of options.


Suppose I wrote "My name is Angel and Demon" in Excel, with "Angel" in bold and "Demon" in italic.


The output I should get in Java is My name is <b>Angel</b> and <i>Demon</i>

1 Answer



I pasted this as Unicode text into cell A1 of an .xlsx file:


<html><p>This is a test. Will this text be <b>bold</b> or <i>italic</i></p></html>

That HTML line produces this in the cell:


This is a test. Will this text be bold or italic


My code:


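import java.io.FileInputStream;
import java.io.IOException;

import org.apache.poi.ss.util.CellReference;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFFont;
import org.apache.poi.xssf.usermodel.XSSFRichTextString;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ExcelWithHtml {
    // Cell A1 holds the rendered form of:
    // <html><p>This is a test. Will this text be <b>bold</b> or
    // <i>italic</i></p></html>

    public static void main(String[] args) throws IOException {
        // The workbook path is read from the first command-line argument.
        new ExcelWithHtml().readFirstCellOfXSSF(args[0]);
    }

    // Track which tags are currently open while walking the formatting runs.
    boolean inBold = false;
    boolean inItalic = false;

    public void readFirstCellOfXSSF(String filePathName) throws IOException {
        // try-with-resources closes both the stream and the workbook.
        try (FileInputStream fis = new FileInputStream(filePathName);
                XSSFWorkbook wb = new XSSFWorkbook(fis)) {
            XSSFSheet sheet = wb.getSheetAt(0);

            String cellHtml = getHtmlFormatedCellValueFromSheet(sheet, "A1");

            System.out.println(cellHtml);
        }
    }

    public String getHtmlFormatedCellValueFromSheet(XSSFSheet sheet,
            String cellName) {

        // Assumes the referenced row and cell exist and hold a string value.
        CellReference cellReference = new CellReference(cellName);
        XSSFRow row = sheet.getRow(cellReference.getRow());
        XSSFCell cell = row.getCell(cellReference.getCol());

        XSSFRichTextString cellText = cell.getRichStringCellValue();
        String text = cellText.getString();

        // A cell without rich-text runs has no formatting to convert.
        if (cellText.numFormattingRuns() == 0) {
            return text;
        }

        StringBuilder htmlCode = new StringBuilder();
        // htmlCode.append("<html>");

        for (int i = 0; i < cellText.numFormattingRuns(); i++) {
            // The font is looked up by run index; it may be null when the
            // run carries no explicit formatting.
            htmlCode.append(getFormatFromFont(cellText.getFontOfFormattingRun(i)));

            int indexStart = cellText.getIndexOfFormattingRun(i);
            int indexEnd = indexStart + cellText.getLengthOfFormattingRun(i);

            htmlCode.append(text, indexStart, indexEnd);
        }

        // Close whatever is still open after the last run.
        if (inItalic) {
            htmlCode.append("</i>");
            inItalic = false;
        }
        if (inBold) {
            htmlCode.append("</b>");
            inBold = false;
        }

        // htmlCode.append("</html>");
        return htmlCode.toString();
    }

    // Emits the opening or closing tags needed to move from the current
    // state to the given font. Text that is both bold and italic may end
    // up with tags closed in the wrong order; this is a quick demo.
    private String getFormatFromFont(XSSFFont font) {
        boolean italic = font != null && font.getItalic();
        boolean bold = font != null && font.getBold();

        String formatHtmlCode = "";
        if (italic && !inItalic) {
            formatHtmlCode += "<i>";
            inItalic = true;
        } else if (!italic && inItalic) {
            formatHtmlCode += "</i>";
            inItalic = false;
        }

        if (bold && !inBold) {
            formatHtmlCode += "<b>";
            inBold = true;
        } else if (!bold && inBold) {
            formatHtmlCode += "</b>";
            inBold = false;
        }

        return formatHtmlCode;
    }
}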


My output:


This is a test. Will this text be <b>bold</b> or <i>italic</i>

I think this is what you want. I'm only showing what is possible here; this is quick code written to produce an output, not an example of best practices.
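
If the quick version needs tightening later, one possible direction is to wrap each formatting run in its own tags and escape the text, so that tags can never overlap and characters such as < in the cell are not read as markup. The sketch below is deliberately POI-free so it can be run on its own: the Run class is a hypothetical stand-in for one formatting run, and in real code its text, bold and italic values would be filled from the run's substring and font (for example via getFontOfFormattingRun). The class and method names are illustrative assumptions, not part of POI.

```java
import java.util.Arrays;
import java.util.List;

/** POI-free sketch: turns (text, bold, italic) runs into well-nested HTML. */
final class RunsToHtml {

    /** Hypothetical stand-in for one formatting run of a rich-text cell. */
    static final class Run {
        final String text;
        final boolean bold;
        final boolean italic;

        Run(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }
    }

    /** Escapes the characters that would otherwise be read as markup. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Opens and closes tags inside each run, so tags never overlap. */
    static String toHtml(List<Run> runs) {
        StringBuilder html = new StringBuilder();
        for (Run run : runs) {
            if (run.bold) html.append("<b>");
            if (run.italic) html.append("<i>");
            html.append(escape(run.text));
            if (run.italic) html.append("</i>");
            if (run.bold) html.append("</b>");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        List<Run> runs = Arrays.asList(
                new Run("My name is ", false, false),
                new Run("Angel", true, false),
                new Run(" and ", false, false),
                new Run("Demon", false, true));
        System.out.println(toHtml(runs));
        // prints: My name is <b>Angel</b> and <i>Demon</i>
    }
}
```

The trade-off is that two adjacent bold runs produce </b><b> between them instead of one continuous tag, which is more verbose but still valid HTML.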




