Out of memory when multiple threads parse XML files concurrently in Java

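Problem description:

When several threads parse XML files at the same time in Java, the application runs out of memory. The XML files being parsed are roughly 2 MB to 10 MB each. How can this be solved? The code is as follows: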

    String sourceFile = ConfigReader.get("data.store.path") + article.getSourceFile();

    // SAXReader builds the complete dom4j tree in memory for every call.
    org.dom4j.Document document = null;
    FileInputStream fin = null;
    try {
        fin = new FileInputStream(new File(sourceFile));
        document = new SAXReader().read(fin);
    } catch (final FileNotFoundException e) {
        final String msg = "Content unit XML file does not exist: " + sourceFile;
        log.error(msg, e);
        throw new RuntimeException(msg, e);
    } catch (final DocumentException e) {
        final String msg = "Failed to parse content unit XML file: " + sourceFile;
        log.error(msg, e);
        throw new RuntimeException(msg, e);
    } finally {
        IOUtils.closeQuietly(fin);
    }

    // Text of the CONTENT child of the DOC element with this article's GUID.
    final StringBuilder xpath = new StringBuilder("/KFMP/DOCS/DOC[@GUID='")
        .append(article.getGuid()).append("']/").append("CONTENT");
    final Node node = document.selectSingleNode(xpath.toString());
    final String value = node.getText();
    article.setContent(value);

    // Same lookup for the COORDS child.
    final StringBuilder coordXpath = new StringBuilder("/KFMP/DOCS/DOC[@GUID='")
        .append(article.getGuid()).append("']/").append("COORDS");
    final Node coordsNode = document.selectSingleNode(coordXpath.toString());
    final String coordsValue = coordsNode.getText();
    article.setCoords(coordsValue);

    // Same lookup for the BRIEF child.
    final StringBuilder briefXpath = new StringBuilder("/KFMP/DOCS/DOC[@GUID='")
        .append(article.getGuid()).append("']/").append("BRIEF");
    final Node briefNode = document.selectSingleNode(briefXpath.toString());
    final String briefValue = briefNode.getText();
    article.setBrief(briefValue);

    return article;

Answer

I suggest using SAX parsing. DOM parsing reads all of the data into memory.

Alternatively, add more hardware or tune the JVM.
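The SAX suggestion could look roughly like the sketch below. It uses only the JDK's built-in SAX parser and keeps just the three fields the question needs, so no document tree is built. The class and method names (DocFieldExtractor, extract) are made up for illustration, and it assumes CONTENT, COORDS and BRIEF hold plain text with no nested elements.

```java
import java.io.InputStream;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: streams the file and keeps only the wanted fields.
public class DocFieldExtractor {

    private static final List<String> WANTED = Arrays.asList("CONTENT", "COORDS", "BRIEF");

    // Collects the text of the wanted elements inside the DOC whose GUID matches.
    private static final class DocHandler extends DefaultHandler {
        private final String guid;
        private final Map<String, String> fields = new HashMap<>();
        private final StringBuilder text = new StringBuilder();
        private boolean inTargetDoc;
        private boolean capturing;

        DocHandler(String guid) {
            this.guid = guid;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("DOC".equals(qName)) {
                inTargetDoc = guid.equals(atts.getValue("GUID"));
            } else if (inTargetDoc && WANTED.contains(qName)) {
                capturing = true;
                text.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times per element, so append rather than assign.
            if (capturing) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (capturing && WANTED.contains(qName)) {
                fields.put(qName, text.toString());
                capturing = false;
            } else if ("DOC".equals(qName)) {
                inTargetDoc = false;
            }
        }
    }

    // Returns a map from element name to text for the DOC with the given GUID.
    public static Map<String, String> extract(InputStream in, String guid) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            DocHandler handler = new DocHandler(guid);
            parser.parse(in, handler);
            return handler.fields;
        } catch (Exception e) {
            throw new IllegalStateException("Failed to parse XML", e);
        }
    }
}
```

In the question's code, the three selectSingleNode calls would then be replaced by one extract call followed by lookups such as fields.get("CONTENT"). This sketch still reads the whole file; it only avoids holding it in memory.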

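Answer

If multiple threads are accessing different XML files, the only option is to limit the number of threads.
If multiple threads are accessing the same XML file, have the first thread parse it and cache the result; later threads then read only from the cache.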
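One way to limit how many threads parse at once, without changing the thread pool itself, is a counting semaphore around the parse call. This is only a sketch: ParseThrottle and its methods are hypothetical names, and the right limit depends on heap size and file size.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical helper: allows at most maxConcurrentParses tasks to run at the same time.
public class ParseThrottle {

    private final Semaphore permits;

    public ParseThrottle(int maxConcurrentParses) {
        this.permits = new Semaphore(maxConcurrentParses);
    }

    // Blocks until a permit is free, runs the task, and always returns the permit.
    public <T> T run(Supplier<T> parseTask) {
        permits.acquireUninterruptibly();
        try {
            return parseTask.get();
        } finally {
            permits.release();
        }
    }

    // Number of parse slots currently free.
    public int available() {
        return permits.availablePermits();
    }
}
```

The file-reading block from the question would go inside the Supplier passed to run, so only a bounded number of document trees exist in memory at any moment.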

Answer

Does parsing XML with SAX also run out of memory? It shouldn't!

Answer

Increase -Xmx. A 32-bit JVM can only use about 1.5 GB at most.

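Answer

Once a document has been initialized, put it in a map with the file name as the key. Before using a file, check whether the map already holds its document, and initialize a new document only if it does not. This ensures that only one document object is created per XML file.

If memory still overflows with this in place, your XML files are too large and the design is unsound; try splitting them into several smaller files.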
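The map-based cache described in this answer might be sketched as follows. ParsedFileCache is a hypothetical name, and the loader function stands in for whatever parses a file (for example the SAXReader call from the question). It assumes the cached object is only read, never modified, by the threads that share it.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical helper: parses each file at most once and shares the result.
public class ParsedFileCache<V> {

    private final ConcurrentMap<String, V> cache = new ConcurrentHashMap<>();
    private final Function<String, V> loader;

    public ParsedFileCache(Function<String, V> loader) {
        this.loader = loader;
    }

    // computeIfAbsent runs the loader at most once per key, even under contention;
    // other threads asking for the same file wait for that result instead of parsing again.
    public V get(String fileName) {
        return cache.computeIfAbsent(fileName, loader);
    }
}
```

Note that every cached document stays in memory until it is removed, so with many distinct files this trades repeated parsing for a permanently larger heap.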

Answer

If the file is too large, split it into multiple XML files and then parse each one separately.

Answer

Then create two org.dom4j.Document objects,
one for reading data and one for modifying data,
and find a way to keep the two in sync.
