Apache FTPClient.listFiles() returns an empty result: discussion and fix

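I recently ran into the problem where Apache's FTPClient.listFiles() returns no files.

Target server environment: HP minicomputer

Client server environment: Linux jstmsapp2 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux (the script runs on this server)

Related jars: commons-net-1.4.1.jar (commons-net-3.3.jar still has this problem), jakarta-oro-2.0.8.jar


My code is as follows: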

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.SocketException;
    import org.apache.commons.net.ftp.FTPClient;
    import org.apache.commons.net.ftp.FTPFile;

    /**
     * @desc: Fetch files from the target server to the local machine via FTP
     * @author<chengsheng.wang@zznode.com>
     * @since 2015-7-27
     *
     * @param url
     * @param userName
     * @param password
     * @param portnum
     * @param path
     * @param localPath
     * @return boolean
     */
    private boolean downLoadFromFtp(String url, String userName, String password, int portnum, String path, String localPath) {
        // The password is deliberately left out of the log.
        logger.info("url=" + url + "  username=" + userName + "  hostpath=" + path + "  localpath=" + localPath);
        boolean flag = false;
        FTPClient ftpClient = new FTPClient();
        ftpClient.setControlEncoding("GBK");
        int count = 0; // number of files synchronized
        try {
            ftpClient.connect(url, portnum);
            boolean loginFlag = ftpClient.login(userName, password);
            logger.info("Login status: " + loginFlag);
            if (!loginFlag) {
                return flag;
            }
            ftpClient.changeWorkingDirectory(path);
            ftpClient.enterLocalPassiveMode();
            FTPFile[] files = ftpClient.listFiles();
            if (null == files || files.length == 0) {
                logger.info("No file data");
                return flag;
            }
            File localpathdir = new File(localPath);
            if (!localpathdir.exists()) {
                localpathdir.mkdirs();
            }
            ftpClient.setBufferSize(1024);
            ftpClient.setFileType(FTPClient.BINARY_FILE_TYPE);
            logger.info("Total entries in host directory: " + files.length);
            for (int i = 0; i < files.length; i++) {
                if (files[i] == null || !files[i].isFile()) {
                    continue; // skip unparsed entries and anything that is not a regular file
                }
                String fileName = files[i].getName();
                logger.info("File " + i + " name: " + fileName);
                File tempfile = new File(localPath, fileName);
                if (tempfile.exists()) {
                    continue; // already present locally, so do not download it again
                }
                boolean retrieved;
                // FTP paths use "/", whatever the local File.separator is.
                try (OutputStream fos = new FileOutputStream(tempfile)) {
                    retrieved = ftpClient.retrieveFile(path + "/" + fileName, fos);
                }
                if (retrieved) {
                    count++;
                } else {
                    logger.error("Download failed: " + fileName);
                    tempfile.delete(); // a leftover partial file would be skipped on the next sync
                }
            }
            flag = true;
        } catch (SocketException e) {
            logger.error("Socket exception", e);
        } catch (IOException e) {
            logger.error("IO exception", e);
        } catch (Exception e) {
            logger.error("FTP download exception", e);
        } finally {
            try {
                if (ftpClient.isConnected()) {
                    ftpClient.logout();
                    ftpClient.disconnect();
                }
            } catch (IOException e) {
                logger.error("Exception while closing the connection", e);
            }
            logger.info("Files synchronized in this run: " + count);
        }
        return flag;
    }
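When execution reaches FTPClient.listFiles(), it stubbornly returns an empty result.

After a lot of searching online, and with hints from earlier posts, I suspected the cause was the Chinese locale on the target server: it produces a file modification date format that Apache Commons Net cannot parse.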
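
The suspicion can be illustrated without any FTP server. The following is a minimal JDK-only sketch, not code from Commons Net. It assumes the parser expects an English-style pattern such as `MMM d HH:mm`, and it uses a made-up Chinese-style date string (`7月27日 10:30`, written with Unicode escapes in the code), because the actual listing text sent by the HP server is not shown in this post. The class and method names are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

/** JDK-only illustration; the sample date strings are hypothetical. */
class ListingDateDemo {

    /** Returns true if the text can be parsed with the given pattern and locale. */
    static boolean canParse(String pattern, Locale locale, String text) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, locale);
        format.setLenient(false);
        try {
            format.parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed English-style pattern for "recent" listing entries.
        String englishPattern = "MMM d HH:mm";
        // \u6708 is the "month" character and \u65e5 is the "day" character.
        String chineseDate = "7\u670827\u65e5 10:30";

        System.out.println(canParse(englishPattern, Locale.US, "Jul 27 10:30")); // true
        System.out.println(canParse(englishPattern, Locale.US, chineseDate));    // false
    }
}
```

Under these assumptions the English date parses and the Chinese-style one raises a ParseException, which is the kind of failure that would make every listing entry unusable.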

I downloaded the source of commons-net-1.4.1.jar: http://apache.fayea.com//commons/net/source/commons-net-1.4.1-src.zip

After adding log statements directly to the source for debugging, the reason FTPClient.listFiles() returned nothing became clear.


Here are the problems in commons-net-1.4.1.jar, one by one:

parseFTPEntry in UnixFTPEntryParser.java

    /**
     * Parses a line of a unix (standard) FTP server file listing and converts
     * it into a usable format in the form of an <code> FTPFile </code>
     * instance.  If the file listing line doesn't describe a file,
     * <code> null </code> is returned, otherwise a <code> FTPFile </code>
     * instance representing the files in the directory is returned.
     * <p>
     * @param entry A line of text from the file listing
     * @return An FTPFile instance corresponding to the supplied entry
     */
	public FTPFile parseFTPEntry(String entry) {
        FTPFile file = new FTPFile();
        file.setRawListing(entry);
        int type;
        boolean isDevice = false;

        if (matches(entry)) // the hard-coded regular expression used here is also a problem: its rules filter out some files because of their last-modified date text
        {
            String typeStr = group(1);
            String hardLinkCount = group(15);
            String usr = group(16);
            String grp = group(17);
            String filesize = group(18);
            String datestr = group(19) + " " + group(20);
            String name = group(21);
            String endtoken = group(22);

            try
            {
                file.setTimestamp(super.parseTimestamp(datestr));  // the problem is here: the locale-specific file date cannot be parsed, so the method returns null and swallows the parse exception
            }
            catch (ParseException e)
            {
            	return null;  // this is a parsing failure too.
            }
The problematic regular expression

    /**
     * this is the regular expression used by this parser.
     *
     * Permissions:
     *    r   the file is readable
     *    w   the file is writable
     *    x   the file is executable
     *    -   the indicated permission is not granted
     *    L   mandatory locking occurs during access (the set-group-ID bit is
     *        on and the group execution bit is off)
     *    s   the set-user-ID or set-group-ID bit is on, and the corresponding
     *        user or group execution bit is also on
     *    S   undefined bit-state (the set-user-ID bit is on and the user
     *        execution bit is off)
     *    t   the 1000 (octal) bit, or sticky bit, is on [see chmod(1)], and
     *        execution is on
     *    T   the 1000 bit is turned on, and execution is off (undefined bit-
     *        state)
     */
    private static final String REGEX =
        "([bcdlfmpSs-])"
        +"(((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-])))\\+?\\s+"
        + "(\\d+)\\s+"
        + "(\\S+)\\s+"
        + "(?:(\\S+)\\s+)?"
        + "(\\d+)\\s+"
        
        /*
          numeric or standard format date
        */
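        + "((?:\\d+[-/]\\d+[-/]\\d+)|(?:\\S+\\s+\\S+))\\s+" // this part is the problem: some files are filtered out here, although the Chinese modification-date format the HP machine uses for some files is admittedly bizarre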
		
        /* 
           year (for non-recent standard format) 
		   or time (for numeric or recent standard format  
		*/
		+ "(\\d+(?::\\d+)?)\\s+"
        
		+ "(\\S*)(\\s*.*)";
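Now that the cause is known, let's discuss solutions.

Earlier posts online put it succinctly: just change these two places.

If you do not care about a file's last modification time, the easiest approach is:

file.setTimestamp(Calendar.getInstance()); // reset the file's last modification time to the current time; harmless as long as nothing relies on the timestamp

The regular expression can be adjusted in the same spirit: "((?:\\d+[-/]\\d+[-/]\\d+)|(?:\\S+\\s+\\S+)|(?:\\S+))\\s+"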
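
To see what the extra `(?:\\S+)` alternative changes, here is a small sketch that runs both versions of the expression against two sample lines. Assumptions: it uses `java.util.regex` as a stand-in for the Jakarta ORO matcher that commons-net 1.4.1 relies on, and both listing lines (including the single-token date `7月27日`, written with Unicode escapes) are invented for illustration, since the real HP listing is not shown in this post. The class name is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Compares the original and the patched date alternative on sample lines. */
class ListingRegexDemo {
    // Type, permissions, link count, owner, optional group, size.
    static final String PREFIX =
        "([bcdlfmpSs-])"
        + "(((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-])))\\+?\\s+"
        + "(\\d+)\\s+"
        + "(\\S+)\\s+"
        + "(?:(\\S+)\\s+)?"
        + "(\\d+)\\s+";
    // Year or time, then the file name and any trailing text.
    static final String SUFFIX =
        "(\\d+(?::\\d+)?)\\s+"
        + "(\\S*)(\\s*.*)";

    static final String ORIGINAL_DATE =
        "((?:\\d+[-/]\\d+[-/]\\d+)|(?:\\S+\\s+\\S+))\\s+";
    static final String PATCHED_DATE =
        "((?:\\d+[-/]\\d+[-/]\\d+)|(?:\\S+\\s+\\S+)|(?:\\S+))\\s+";

    static final Pattern ORIGINAL = Pattern.compile(PREFIX + ORIGINAL_DATE + SUFFIX);
    static final Pattern PATCHED = Pattern.compile(PREFIX + PATCHED_DATE + SUFFIX);

    public static void main(String[] args) {
        // Hypothetical listing lines; \u6708 is "month", \u65e5 is "day".
        String english = "-rw-r--r--   1 user  group  1024 Jul 27 10:30 file.txt";
        String chinese = "-rw-r--r--   1 user  group  1024 7\u670827\u65e5 10:30 file.txt";

        System.out.println(ORIGINAL.matcher(english).matches()); // true
        System.out.println(ORIGINAL.matcher(chinese).matches()); // false

        Matcher m = PATCHED.matcher(chinese);
        if (m.matches()) {
            // Groups 19, 20 and 21 are the date, the time and the file name.
            System.out.println(m.group(19) + " | " + m.group(20) + " | " + m.group(21));
        }
    }
}
```

With these sample lines, the original expression rejects the single-token date while the patched one accepts it and still matches the English line. Whether the patched expression covers every format the HP machine emits is not something this sketch can show.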

Rebuild a new commons-net-1.4.1.jar, run the program again, and the world is finally at peace.

See: http://www.blogjava.net/wodong/archive/2008/08/21/wodong.html


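But is this really a good approach? Is it elegant? The Apache contributors' code actually leaves room for us to work around this bug without patching the library.

Let's step through the code, starting from org.apache.commons.net.ftp.FTPClient.listFiles().

listFiles() ultimately calls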

public FTPFile[] listFiles(String pathname)
    throws IOException
    {
        String key = null;
        FTPListParseEngine engine =
            initiateListParsing(key, pathname);
        return engine.getFiles();

    }
Next, look at initiateListParsing(key, pathname)

    public FTPListParseEngine initiateListParsing(
            String parserKey, String pathname)
    throws IOException
    {
        // We cache the value to avoid creation of a new object every
        // time a file listing is generated.
        if(__entryParser == null) {
            if (null != parserKey) {
                // if a parser key was supplied in the parameters, 
                // use that to create the paraser
        	    __entryParser = 
        	        __parserFactory.createFileEntryParser(parserKey);
                
            } else {
	            // if no parserKey was supplied, check for a configuration
	        	// in the params, and if non-null, use that.
            	if (null != __configuration) {
            	    __entryParser = 
            	        __parserFactory.createFileEntryParser(__configuration);
            	} else {
                    // if a parserKey hasn't been supplied, and a configuration
            	    // hasn't been supplied, then autodetect by calling
                    // the SYST command and use that to choose the parser.
            	    __entryParser = 
            	        __parserFactory.createFileEntryParser(getSystemName());
             	}
            }
        }

        return initiateListParsing(__entryParser, pathname);

    }
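It turns out that __entryParser can be initialized from the __configuration field. By default __configuration is null, so execution reaches

__entryParser =    __parserFactory.createFileEntryParser(getSystemName());  // creates a parser that does not support the Chinese date format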


Now suppose we have created an FTPClientConfig and the parser is initialized from it. Following the code further:

	public FTPFileEntryParser createFileEntryParser(FTPClientConfig config) 
	throws ParserInitializationException 
	{
	    this.config = config;
		String key = config.getServerSystemKey();
		return createFileEntryParser(key);
	}
Step into createFileEntryParser(key), which reveals the final answer

    public FTPFileEntryParser createFileEntryParser(String key)
    {
        Class parserClass = null;
        FTPFileEntryParser parser = null;
        try
        {
            parserClass = Class.forName(key); // could we use key to instantiate a custom FTPFileEntryParser? The key is passed in from FTPClientConfig
            parser = (FTPFileEntryParser) parserClass.newInstance();
        }
        catch (ClassNotFoundException e)
        {
            String ukey = null;
            if (null != key)
            {
                ukey = key.toUpperCase();
            }
            if (ukey.indexOf(FTPClientConfig.SYST_UNIX) >= 0)
            {
                parser = createUnixFTPEntryParser();
            }
            else if (ukey.indexOf(FTPClientConfig.SYST_VMS) >= 0)
            {
                parser = createVMSVersioningFTPEntryParser();
            }
            else if (ukey.indexOf(FTPClientConfig.SYST_NT) >= 0)
            {
                parser = createNTFTPEntryParser();
            }
            else if (ukey.indexOf(FTPClientConfig.SYST_OS2) >= 0)
            {
                parser = createOS2FTPEntryParser();
            }
            else if (ukey.indexOf(FTPClientConfig.SYST_OS400) >= 0)
            {
                parser = createOS400FTPEntryParser();
            }
            else if (ukey.indexOf(FTPClientConfig.SYST_MVS) >= 0)
            {
                parser = createMVSEntryParser();
        	}
            else
            {
                throw new ParserInitializationException("Unknown parser type: " + key);
            }
        }
        catch (ClassCastException e)
        {
            throw new ParserInitializationException(parserClass.getName()
                + " does not implement the interface "
                + "org.apache.commons.net.ftp.FTPFileEntryParser.", e);
        }
        catch (Throwable e)
        {
            throw new ParserInitializationException("Error initializing parser", e);
        }

        if (parser instanceof Configurable) {
            ((Configurable)parser).configure(this.config);
        }    
        return parser;
    }

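Attentive readers will have noticed that we can set an FTPClientConfig on the FTPClient object,

and use FTPClientConfig's systemKey property to instantiate a custom FTPFileEntryParser that handles the parsing of file timestamps and other details.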
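
The lookup logic just described can be sketched in isolation. This is a simplified stand-in, not the Commons Net factory: `EntryParser`, `BuiltInUnixParser`, `CustomParser` and `create` are hypothetical names, and only the UNIX fallback is modeled. The point is that a key which happens to be a fully qualified class name is instantiated reflectively, while any other key falls through to keyword matching.

```java
import java.util.Locale;

/** Simplified model of "try the key as a class name, else match keywords". */
class ParserFactoryDemo {

    /** Hypothetical stand-in for FTPFileEntryParser. */
    interface EntryParser {
        String describe();
    }

    /** Stand-in for a built-in parser chosen by a system-type keyword. */
    static class BuiltInUnixParser implements EntryParser {
        public String describe() {
            return "built-in UNIX parser";
        }
    }

    /** Stand-in for a user-supplied parser that is loaded by class name. */
    public static class CustomParser implements EntryParser {
        public CustomParser() {
        }

        public String describe() {
            return "custom parser";
        }
    }

    static EntryParser create(String key) {
        try {
            // First attempt: treat the key as a fully qualified class name.
            Class<?> parserClass = Class.forName(key);
            return (EntryParser) parserClass.getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException e) {
            // Not a class name, so fall back to keyword matching.
            if (key.toUpperCase(Locale.ROOT).contains("UNIX")) {
                return new BuiltInUnixParser();
            }
            throw new IllegalArgumentException("Unknown parser type: " + key);
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Error initializing parser: " + key, e);
        }
    }

    public static void main(String[] args) {
        // A SYST-style reply is not a class name, so the keyword branch is used.
        System.out.println(create("UNIX Type: L8").describe());
        // A class name selects the custom implementation.
        System.out.println(create(CustomParser.class.getName()).describe());
    }
}
```

In this model, passing the custom class name is all it takes to bypass the built-in parser, which mirrors the role the systemKey plays in the walkthrough above.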

Now check whether FTPClientConfig has a suitable constructor

	/**
	 * The main constructor for an FTPClientConfig object
	 * @param systemKey key representing system type of the  server being 
	 * connected to. See {@link #getServerSystemKey() serverSystemKey}
	 */
	public FTPClientConfig(String systemKey) {
		this.serverSystemKey = systemKey;
	}
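Good, there is exactly such a constructor, so we can get to work.


Create a new UnixFTPEntryParser that extends ConfigurableFTPFileEntryParserImpl, and then, before calling ftpClient.listFiles(), configure the ftpClient object with an FTPClientConfig.

The code: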

    ftpClient.changeWorkingDirectory(path);
    ftpClient.enterLocalPassiveMode();
    // Commons Net cannot parse the Chinese-locale dates, so a custom parser class handles them
    ftpClient.configure(new FTPClientConfig("com.zznode.tnms.ra.c11n.nj.resource.ftp.UnixFTPEntryParser"));
    FTPFile[] files = ftpClient.listFiles();



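Finally done, and it was tiring. Attached are the custom UnixFTPEntryParser.java and FTPTimestampParserImplExZH.java (the latter handles Chinese dates; readers who do not care about modification dates can skip it):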

http://download.csdn.net/detail/wangchsh2008/8939331
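
The attached files are not reproduced here. As a rough idea of the kind of date handling such a parser needs, below is a JDK-only sketch under stated assumptions: the date format (`M月d日 HH:mm`, with no year) is only a guess at what a Chinese-locale listing might contain, the year is borrowed from the current date and rolled back one year when the result would lie in the future, and all names are hypothetical. It is not the FTPTimestampParserImplExZH from the download above.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;

/** Hypothetical parser for year-less Chinese-style listing dates. */
class ChineseListingDateSketch {

    // Assumed format: "<month>\u6708<day>\u65e5 HH:mm", prefixed with a borrowed year.
    private static final String PATTERN = "yyyy M'\u6708'd'\u65e5' HH:mm";

    /** Parses text such as "7\u670827\u65e5 10:30" relative to the given "now". */
    static Calendar parse(String text, Calendar now) throws ParseException {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.ROOT);
        format.setLenient(false);
        format.setTimeZone(now.getTimeZone());

        // The listing carries no year, so start from the current one.
        Calendar result = (Calendar) now.clone();
        result.setTime(format.parse(now.get(Calendar.YEAR) + " " + text));

        // A date in the future most likely belongs to the previous year.
        if (result.after(now)) {
            result.add(Calendar.YEAR, -1);
        }
        return result;
    }

    /** Falls back to "now" when the text cannot be parsed. */
    static Calendar parseOrNow(String text, Calendar now) {
        try {
            return parse(text, now);
        } catch (ParseException e) {
            return (Calendar) now.clone();
        }
    }

    public static void main(String[] args) {
        Calendar now = Calendar.getInstance();
        Calendar parsed = parseOrNow("7\u670827\u65e5 10:30", now);
        System.out.println(parsed.getTime());
    }
}
```

The `parseOrNow` fallback plays the same role as resetting the timestamp to the current time, so an unparseable date no longer causes the whole entry to be dropped.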

The approach in this post draws on earlier posts by others. I have verified the solution in real use and am sharing it here for reference.









Copyright notice: this is an original article by the author and may not be reproduced without the author's permission.