Java XML wrong encoding UTF-8

Time: 2022-10-24 23:35:20

When I write the file, it shows strange characters. I have read that I need to use FileOutputStream to solve the problem, but I am very new to this and do not know how to do it. My code is wrong: the call to build(xml) throws an error, and I do not know whether this is the right way to write the output file.

<?xml version="1.0" encoding="UTF-8"?>
 <prueba>
     <reg id="576340">
           <dato cant="680" id="1" val="-1" num="" desc="résd" />
           <dato cant="684" id="5" val="-1" num="" desc="да и вообще" /> 
           <dato cant="1621" id="1" val="-1" num="" desc="Hi" />
           <dato cant="1625" id="5" val="-1" num="" desc="Hola" />  
     </reg>
 </prueba>


public static void main(String[] args) throws FileNotFoundException, 
     JDOMException, IOException {

SAXBuilder builder = new SAXBuilder();
File xml = new File("c:\\prueba3.xml");
Writer out = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream(xml), "UTF8"));
Document doc = (Document) new SAXBuilder().build(xml);
Element raiz = doc.getRootElement();
List articleRow = raiz.getChildren("reg"); 

for (int i = 0; i < articleRow.size(); i++) {

    Element row = (Element) articleRow.get(i);
    List images = row.getChildren("dato");

     for (int j = 0; j < images.size(); j++) {

         Element row2 = (Element) images.get(j);
         String texto = row2.getAttributeValue("desc") ;
         String id = row2.getAttributeValue("id"); 

         if ((texto != null) && (texto !="") && 
            (id.equals("1") || id.equals("2"))){                   

         //row2.getChild("desc").setText("valor");   
         out.append(row2.getAttribute("desc")
                   .setValue("raúl").toString());
         }
     }
}
 out.flush();
 out.close();
 System.out.println("fin de programa");  
}

This is the output data:

<?xml version="1.0" encoding="UTF-8"?>
 <prueba>
    <reg id="576340">
           <dato cant="680" id="1" val="-1" num="" desc="ra????/>
           <dato cant="684" id="5" val="-1" num="" desc="..?? ? ??????/>
           <dato cant="1621" id="1" val="-1" num="" desc="ra????/>
           <dato cant="1625" id="5" val="-1" num="" desc="Hola" />
    </reg>
  </prueba>  

Error log

Exception in thread "main" org.jdom.input.JDOMParseException: Error on line 1 of document file:/c:/prueba3.xml: Final de archivo prematuro.
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:530)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:905)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:884)
at Prueba.main(Prueba.java:27)
Caused by: org.xml.sax.SAXParseException; systemId: file:/c:/prueba3.xml; lineNumber: 1; columnNumber: 1; Final de archivo prematuro.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:518)
... 3 more
Caused by: org.xml.sax.SAXParseException; systemId: file:/c:/prueba3.xml; lineNumber: 1; columnNumber: 1; Final de archivo prematuro.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:518)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:905)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:884)
at Prueba.main(Prueba.java:27)

I would appreciate your help.

2 solutions

#1


2  

Depending on the target encoding, you have to decide how the data will be written to the filesystem. You decided to write it as 'UTF8'.

Writer out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(xml), "UTF8"));

You have to make sure that the program which loads the data knows it is encoded in UTF-8. For example, Notepad++ lets you choose an encoding other than the system default. In many cases UTF-8 is not the system default, so you have to supply that information when loading the files.

Please also check Java FileReader encoding issue
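
To illustrate the point about the reader needing to know the encoding, here is a minimal sketch using only JDK classes. The class and method names (EncodingRoundTrip, writeUtf8ThenRead) are made up for this example and do not come from the question. It writes a string through an OutputStreamWriter configured for UTF-8 and then reads the same file back with a charset of your choice, so you can see what a mismatch looks like.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
class EncodingRoundTrip {

    // Writes text to a temp file as UTF-8, then decodes the file with readCharset.
    static String writeUtf8ThenRead(String text, Charset readCharset) throws IOException {
        File tmp = File.createTempFile("encoding-demo", ".txt");
        try {
            try (Writer out = new BufferedWriter(
                    new OutputStreamWriter(new FileOutputStream(tmp), StandardCharsets.UTF_8))) {
                out.write(text);
            }
            StringBuilder sb = new StringBuilder();
            try (Reader in = new BufferedReader(
                    new InputStreamReader(new FileInputStream(tmp), readCharset))) {
                int c;
                while ((c = in.read()) != -1) {
                    sb.append((char) c);
                }
            }
            return sb.toString();
        } finally {
            tmp.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        // \u00fa is "ú"; the escape keeps the demo independent of the source file encoding.
        String original = "ra\u00fal";
        String right = writeUtf8ThenRead(original, StandardCharsets.UTF_8);
        String wrong = writeUtf8ThenRead(original, StandardCharsets.ISO_8859_1);
        System.out.println(original.equals(right)); // true
        System.out.println(original.equals(wrong)); // false: "ú" was decoded as two characters
    }
}
```

The file on disk is identical in both cases; only the charset used for decoding differs. That is why an editor or console that assumes a different encoding can show strange characters even though the bytes were written correctly.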

#2


1  

An example file with this content:

<?xml version="1.0" encoding="UTF-8"?>
 <prueba>
     <reg id="123456">
           <dato cantidad="680" id="1" val="-1" num="" desc="résd" />
           <dato cantidad="684" id="5" val="-1" num="" desc="да и вообще" /> 
           <dato cantidad="1621" id="1" val="-1" num="" desc="Hi" />
           <dato cantidad="1625" id="5" val="-1" num="" desc="Hola" />  
     </reg>
 </prueba>

can be parsed using Java's built-in DOM API.

Example:

import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public static void main(String[] args) throws IOException, ParserConfigurationException, SAXException {
    final File fXmlFile = new File("./Details2.xml");
    final DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    final DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    // The parser takes the encoding from the XML declaration of the file.
    final Document document = dBuilder.parse(fXmlFile);
    document.getDocumentElement().normalize();
    final NodeList regNodeList = document.getElementsByTagName("reg");
    for (int counter = 0; counter < regNodeList.getLength(); counter++) {
        final Node nNode = regNodeList.item(counter);
        System.out.println("Current Element :" + nNode.getNodeName());
        System.out.println("regs id : " + ((Element) nNode).getAttribute("id"));
        final NodeList nList2 = ((Element) nNode).getElementsByTagName("dato");

        // Print the attributes of every <dato> element under this <reg>.
        for (int counterChilds = 0; counterChilds < nList2.getLength(); counterChilds++) {
            final Node nNode2 = nList2.item(counterChilds);
            if (nNode2.getNodeType() == Node.ELEMENT_NODE) {
                final Element eElement = (Element) nNode2;
                System.out.println(String.format("Cantidad %s,id %s,val %s,num %s,Desc %s",
                        eElement.getAttribute("cantidad"), eElement.getAttribute("id"),
                        eElement.getAttribute("val"), eElement.getAttribute("num"), eElement.getAttribute("desc")));
            }
        }
    }
}
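
The example above only reads the file. As a possible follow-up, here is a sketch of how the change from the question (setting desc on the dato elements whose id is "1" or "2") could be written back as UTF-8 with the same built-in DOM API. The class name RewriteDescAttributes and the method rewrite are invented for this sketch, and it uses a temp file instead of c:\prueba3.xml. Note the order of the steps: new FileOutputStream(file) truncates the file as soon as it is opened, so opening it before parsing leaves the parser with an empty file, which matches the "premature end of file" error at line 1, column 1 in the question.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of the original answers.
class RewriteDescAttributes {

    // Sets desc to newDesc on every <dato> with a non-empty desc and id "1" or "2",
    // then writes the document back to the same file as UTF-8.
    static void rewrite(File xmlFile, String newDesc) throws Exception {
        // 1. Parse first, while the file still has its content.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(xmlFile);

        // 2. Modify the tree in memory.
        NodeList datos = document.getElementsByTagName("dato");
        for (int i = 0; i < datos.getLength(); i++) {
            Element dato = (Element) datos.item(i);
            String desc = dato.getAttribute("desc"); // "" when the attribute is missing
            String id = dato.getAttribute("id");
            if (!desc.isEmpty() && (id.equals("1") || id.equals("2"))) {
                dato.setAttribute("desc", newDesc);
            }
        }

        // 3. Only now open the output stream (this truncates the file) and serialize.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        try (OutputStream out = new FileOutputStream(xmlFile)) {
            transformer.transform(new DOMSource(document), new StreamResult(out));
        }
    }

    public static void main(String[] args) throws Exception {
        File xmlFile = File.createTempFile("prueba", ".xml");
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<prueba><reg id=\"576340\">"
                + "<dato id=\"1\" desc=\"r\u00e9sd\"/>"
                + "<dato id=\"5\" desc=\"Hola\"/>"
                + "</reg></prueba>";
        Files.write(xmlFile.toPath(), xml.getBytes(StandardCharsets.UTF_8));

        rewrite(xmlFile, "ra\u00fal"); // \u00fa is "ú"
        System.out.println(new String(Files.readAllBytes(xmlFile.toPath()), StandardCharsets.UTF_8));
        xmlFile.delete();
    }
}
```

Handing the Transformer an OutputStream rather than a Writer lets the serializer do the character encoding itself, so the bytes on disk agree with the encoding named in the XML declaration. The console may still show odd characters if it uses a different charset; open the file in an editor set to UTF-8 to check the real content.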
