smp8671/android: Analysis of files mysteriously going missing on yaffs2 (part 2)
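After fixing the problem described in the previous article, I thought the issue was really solved. I upgraded the software on Friday evening and left the test running, planning to check the result the following Monday. Unfortunately the problem came back: files were still going missing. Looking again at the code changed in the previous article, it evidently fixed a different problem, namely yaffs hanging when used from multiple processes. What showed up this time is the real file-loss problem. The yaffs in this release seems to have a lot of problems. I really don't know what sigma-design was doing; did none of this show up in their testing?
Continuing to debug, the logs show that .so libraries frequently cannot be found. libui.so, libdvm.so and libmedia_jni.so can each go missing, and sometimes two libraries are missing at once.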
bionic/linker/linker.c:2239| ERROR: 1817 could not load needed library 'libui.so' for 'libmedia.so' (load_library[1229]: Library 'libui.so' not found)
Looking up a file first runs yaffs_lookup(). To see more, I added prints in fs/yaffs2 and raised the log level in yaffs_vfs_glue.c as follows:
    unsigned int yaffs_trace_mask = YAFFS_TRACE_BAD_BLOCKS |
                                    YAFFS_TRACE_ALWAYS |
                                    YAFFS_TRACE_OS |
                                    YAFFS_TRACE_TRACING |
                                    YAFFS_TRACE_SCAN;
This produced new log output:
[ 32.700000] yaffs_lookup dentry
[ 32.704000] yaffs_lookup for dev=90ccf000, 740:libui.so
[ 32.732000] yaffs_lookup not found
[ 32.736000] yaffs_lookup for dev=90c53000, 297:libui.so
[ 32.768000] yaffs_lookup not found
The log above shows that libui.so was not found. Tracing further into yaffs_lookup, yaffs_find_by_name returns NULL. I modified that function to run the search a second time, and the second search did find the file. So the flash was initialized correctly, yet the lookup occasionally fails. The modified function:
    yaffs_obj_t *yaffs_find_by_name(yaffs_obj_t *directory, const YCHAR *name)
    {
        int sum;
        struct ylist_head *i;
        YCHAR buffer[YAFFS_MAX_NAME_LENGTH + 1];
        yaffs_obj_t *l;
        int windsome_obj_name_len = 0;

        if (!name)
            return NULL;

        if (!directory) {
            T(YAFFS_TRACE_ALWAYS,
              (TSTR("tragedy: yaffs_find_by_name: null pointer directory" TENDSTR)));
            YBUG();
            return NULL;
        }
        if (directory->variant_type != YAFFS_OBJECT_TYPE_DIRECTORY) {
            T(YAFFS_TRACE_ALWAYS,
              (TSTR("tragedy: yaffs_find_by_name: non-directory" TENDSTR)));
            YBUG();
        }

        sum = yaffs_calc_name_sum(name);
        T(YAFFS_TRACE_TRACING,
          (TSTR("yaffs_find_by_name to find: name=%s, len=%d, sum=%u\n"),
           name, yaffs_strnlen(name, YAFFS_MAX_NAME_LENGTH), sum));

        /* First pass: the normal lookup. */
        ylist_for_each(i, &directory->variant.dir_variant.children) {
            if (i) {
                l = ylist_entry(i, yaffs_obj_t, siblings);

                if (l->parent != directory) {
                    T(YAFFS_TRACE_ALWAYS,
                      (TSTR("tragedy: yaffs_find_by_name1: parent diff" TENDSTR)));
                    YBUG();
                }

                yaffs_check_obj_details_loaded(l);

                /* Special case for lost-n-found */
                if (l->obj_id == YAFFS_OBJECTID_LOSTNFOUND) {
                    if (yaffs_strcmp(name, YAFFS_LOSTNFOUND_NAME) == 0) {
                        T(YAFFS_TRACE_TRACING,
                          (TSTR("yaffs_find_by_name1: name=%s\n"), name));
                        return l;
                    }
                } else if (yaffs_sum_cmp(l->sum, sum) || l->hdr_chunk <= 0) {
                    /* LostnFound chunk called Objxxx
                     * Do a real check
                     */
                    T(YAFFS_TRACE_TRACING,
                      (TSTR("yaffs_find_by_name1: sum=%u, hdr_chunk=%d, short_name=%s\n"),
                       l->sum, l->hdr_chunk, l->short_name?:""));
                    windsome_obj_name_len = yaffs_get_obj_name(l, buffer, YAFFS_MAX_NAME_LENGTH + 1);
                    T(YAFFS_TRACE_TRACING,
                      (TSTR("yaffs_find_by_name1: obj_name=%s, windsome_obj_name_len=%d\n"),
                       buffer, windsome_obj_name_len));
                    if (yaffs_strncmp(name, buffer, YAFFS_MAX_NAME_LENGTH) == 0) {
                        T(YAFFS_TRACE_TRACING,
                          (TSTR("yaffs_find_by_name1: name=%s, obj_name=%s\n"), name, buffer));
                        return l;
                    }
                }
            }
        }

        /* Second pass: diagnostics only. It repeats the search to see whether
         * the object can be found now, but always returns NULL.
         */
        T(YAFFS_TRACE_OS,
          (TSTR("yaffs_find_by_name: sum=%d, ret=NULL, try again! only for print\n"), sum));
        ylist_for_each(i, &directory->variant.dir_variant.children) {
            if (i) {
                l = ylist_entry(i, yaffs_obj_t, siblings);

                if (l->parent != directory) {
                    T(YAFFS_TRACE_ALWAYS,
                      (TSTR("tragedy: yaffs_find_by_name2: parent diff" TENDSTR)));
                    YBUG();
                }

                yaffs_check_obj_details_loaded(l);

                /* Special case for lost-n-found */
                if (l->obj_id == YAFFS_OBJECTID_LOSTNFOUND) {
                    if (yaffs_strcmp(name, YAFFS_LOSTNFOUND_NAME) == 0) {
                        T(YAFFS_TRACE_TRACING,
                          (TSTR("yaffs_find_by_name2: name=%s\n"), name));
                        return NULL;
                    }
                } else if (yaffs_sum_cmp(l->sum, sum) || l->hdr_chunk <= 0) {
                    /* LostnFound chunk called Objxxx
                     * Do a real check
                     */
                    T(YAFFS_TRACE_TRACING,
                      (TSTR("yaffs_find_by_name2: sum=%u, hdr_chunk=%d, short_name=%s\n"),
                       l->sum, l->hdr_chunk, l->short_name?:""));
                    windsome_obj_name_len = yaffs_get_obj_name(l, buffer, YAFFS_MAX_NAME_LENGTH + 1);
                    T(YAFFS_TRACE_TRACING,
                      (TSTR("yaffs_find_by_name2: obj_name=%s, windsome_obj_name_len=%d\n"),
                       buffer, windsome_obj_name_len));
                    if (yaffs_strncmp(name, buffer, YAFFS_MAX_NAME_LENGTH) == 0) {
                        /* Found on the second pass although the first pass missed it. */
                        T(YAFFS_TRACE_TRACING,
                          (TSTR("yaffs_find_by_name2: name=%s, obj_name=%s\n"), name, buffer));
                        return NULL;
                    }
                }
            }
        }

        T(YAFFS_TRACE_OS, (TSTR("yaffs_find_by_name: final ret=NULL\n")));
        return NULL;
    }
To confirm whether the inode of the missing file really exists, I added a print in yaffs_add_obj_to_dir:
    +   T(YAFFS_TRACE_TRACING,
    +     (TSTR("yaffs_add_obj_to_dir: sum=%u, obj_id=%u, short_name=%s, parent:(%u,%u,%s)\n"),
    +      obj->sum, obj->obj_id, obj->short_name,
    +      directory->sum, directory->obj_id, directory->short_name));
        /* Now add it */
        ylist_add(&obj->siblings, &directory->variant.dir_variant.children);
        obj->parent = directory;
The log obtained this way shows that the object for libui.so, obj_id=879, does exist. From reading the code, the problem looks closely tied to short_name.
[ 17.552000] yaffs_add_obj_to_dir: sum=0, obj_id=0, short_name=, parent:(0,2,)
[ 17.560000] yaffs_add_obj_to_dir: sum=0, obj_id=879, short_name=, parent:(0,740,)
[ 17.568000] yaffs_add_obj_to_dir: sum=0, obj_id=0, short_name=, parent:(0,2,)
[ 17.576000] yaffs_add_obj_to_dir: sum=0, obj_id=368, short_name=, parent:(0,298,)
[ 17.584000] yaffs_add_obj_to_dir: sum=0, obj_id=0, short_name=, parent:(0,2,)
[ 17.588000] yaffs_add_obj_to_dir: sum=0, obj_id=624, short_name=, parent:(0,610,)
[ 17.596000] yaffs_add_obj_to_dir: sum=0, obj_id=0, short_name=, parent:(0,2,)
After reading the code that deals with short_name, I added a print in yaffs_set_obj_name(), as follows:
    void yaffs_set_obj_name(yaffs_obj_t *obj, const YCHAR *name)
    {
    #ifdef CONFIG_YAFFS_SHORT_NAMES_IN_RAM
        memset(obj->short_name, 0, sizeof(YCHAR) * (YAFFS_SHORT_NAME_LENGTH+1));
        /* Only names that fit are cached in RAM; longer names leave short_name empty. */
        if (name && yaffs_strnlen(name,YAFFS_SHORT_NAME_LENGTH+1) <= YAFFS_SHORT_NAME_LENGTH)
            yaffs_strcpy(obj->short_name, name);
        else
            obj->short_name[0] = _Y('\0');
    #endif
        obj->sum = yaffs_calc_name_sum(name);
        T(YAFFS_TRACE_TRACING,
          (TSTR("yaffs_set_obj_name: sum=%u, obj_id=%d, hdr_chunk=%d, short_name=%s (%s)\n"),
           obj->sum, obj->obj_id, obj->hdr_chunk, obj->short_name, name?:""));
    }
With this print added I got the log below. Normally obj_id=764 is libdvm.so, but here it shows up as libmedia_jni.so. Why?
[ 38.796000] yaffs_set_obj_name: sum=6787, obj_id=767, hdr_chunk=54162, short_name=librmomx.so (librmomx.so)
[ 38.804000] yaffs_check_obj_details_loaded: chunkData=0x92c73000
[ 38.816000] yaffs_set_obj_name: sum=62483, obj_id=766, hdr_chunk=54156, short_name= (libstagefright_color_conversion.so)
[ 38.824000] yaffs_check_obj_details_loaded: chunkData=0x92c73000
[ 38.840000] yaffs_set_obj_name: sum=11922, obj_id=765, hdr_chunk=54114, short_name=libmedia_jni.so (libmedia_jni.so)
[ 38.848000] yaffs_check_obj_details_loaded: chunkData=0x92c73000
[ 38.860000] yaffs_set_obj_name: sum=11922, obj_id=764, hdr_chunk=53515, short_name=libmedia_jni.so (libmedia_jni.so)
[ 38.868000] yaffs_check_obj_details_loaded: chunkData=0x92c73000
[ 38.884000] yaffs_set_obj_name: sum=4566, obj_id=763, hdr_chunk=53507, short_name=libETC1.so (libETC1.so)
[ 38.892000] yaffs_check_obj_details_loaded: chunkData=0x92c73000
[ 38.904000] yaffs_set_obj_name: sum=7769, obj_id=762, hdr_chunk=53481, short_name=libskiagl.so (libskiagl.so)
[ 38.912000] yaffs_check_obj_details_loaded: chunkData=0x92c73000
[ 38.920000] yaffs_set_obj_name: sum=18559, obj_id=761, hdr_chunk=53462, short_name= (libreference-ril.so)
[ 38.928000] yaffs_check_obj_details_loaded: chunkData=0x92c73000
Looking at where yaffs_set_obj_name is called in the code, both yaffs_get_obj_name and yaffs_check_obj_details_loaded call yaffs_set_obj_name().
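The sum values in the yaffs_set_obj_name log can be cross-checked by hand. The sketch below assumes the name checksum is a position-weighted sum of the name's bytes truncated to 16 bits; that is an assumption about how yaffs_calc_name_sum behaves in a build without case-insensitive names, not code copied from this tree. The names name_sum, MAX_NAME_LENGTH and verify_against_log are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NAME_LENGTH 255 /* assumed to mirror YAFFS_MAX_NAME_LENGTH */

/* Assumed checksum: sum of (byte value * 1-based position), kept in 16 bits. */
static uint16_t name_sum(const char *name)
{
	const unsigned char *p = (const unsigned char *)name;
	uint16_t sum = 0;
	uint16_t i = 1;

	if (!p)
		return 0;
	while (*p && i < MAX_NAME_LENGTH / 2) {
		sum = (uint16_t)(sum + (unsigned)(*p) * i);
		i++;
		p++;
	}
	return sum;
}

/* Compare the assumed checksum with two values printed in the log. */
static void verify_against_log(void)
{
	assert(name_sum("libETC1.so") == 4566);       /* obj_id=763 in the log */
	assert(name_sum("libmedia_jni.so") == 11922); /* obj_id=765 in the log */
}
```

Under this assumption the logged values 4566 and 11922 are reproduced, while name_sum("libdvm.so") gives 4497. That suggests the sum=11922 stored for obj_id=764 was computed from a header buffer that still held the name libmedia_jni.so, not from the object's own header.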
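The relevant code in yaffs_get_obj_name is shown below. It is similar to yaffs_check_obj_details_loaded: both obtain a temporary buffer and then call yaffs_rd_chunk_tags_nand(). The difference is that yaffs_get_obj_name additionally does memset(buffer, 0, obj->my_dev->data_bytes_per_chunk);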
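    int result;
    __u8 *buffer = yaffs_get_temp_buffer(obj->my_dev, __LINE__);
    yaffs_obj_header *oh = (yaffs_obj_header *) buffer;

    /* The buffer is cleared before the read, so a read that leaves it untouched yields an empty name. */
    memset(buffer, 0, obj->my_dev->data_bytes_per_chunk);

    if (obj->hdr_chunk > 0) {
        result = yaffs_rd_chunk_tags_nand(obj->my_dev, obj->hdr_chunk, buffer, NULL);
    }
    yaffs_load_name_from_oh(obj->my_dev,name,oh->name,buffer_size);
Looking back at the earlier logs, for the same file the outcome differs depending on whether yaffs_get_obj_name or yaffs_check_obj_details_loaded is called first. When yaffs_get_obj_name comes first, running find /system/lib/ -name lib* prints odd messages (libui.so not found). When yaffs_check_obj_details_loaded comes first, find prints nothing unusual, but ls -l shows two completely identical files (same size, date and name). The whole difference comes from the effect of memset() on the result. The relevant code in yaffs_check_obj_details_loaded is: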
    if (in->lazy_loaded && in->hdr_chunk > 0) {
        in->lazy_loaded = 0;
        chunkData = yaffs_get_temp_buffer(dev, __LINE__);
        T(YAFFS_TRACE_TRACING,
          (TSTR("yaffs_check_obj_details_loaded: chunkData=0x%p" TENDSTR), chunkData));
        /* No memset here: if the read leaves chunkData untouched, the previous header is reused. */
        result = yaffs_rd_chunk_tags_nand(dev, in->hdr_chunk, chunkData, &tags);
        oh = (yaffs_obj_header *) chunkData;
Continuing the analysis, the logs show that the problem lies inside yaffs_rd_chunk_tags_nand(), which goes on to run
    result = dev->param.read_chunk_tags_fn(dev, realignedChunkInNAND, buffer, tags);
and from there reaches nandmtd2_ReadChunkWithTagsFromNAND(yaffs_dev_t *dev, int nand_chunk, __u8 *data, yaffs_ext_tags *tags). I modified this function as follows:
    int nandmtd2_ReadChunkWithTagsFromNAND(yaffs_dev_t *dev, int nand_chunk,
                                           __u8 *data, yaffs_ext_tags *tags)
    {
        struct mtd_info *mtd = yaffs_dev_to_mtd(dev);
    #if (MTD_VERSION_CODE > MTD_VERSION(2, 6, 17))
        struct mtd_oob_ops ops;
    #endif
        size_t dummy = 0;
        int retval = 0;
        int localData = 0;
        int windsome_read_twice = 1;   /* number of retries allowed */
        int windsome_path = 0;         /* 0: mtd->read, 1: mtd->read_oob */

        loff_t addr = ((loff_t) nand_chunk) * dev->param.total_bytes_per_chunk;

        yaffs_PackedTags2 pt;

        int packed_tags_size = dev->param.no_tags_ecc ? sizeof(pt.t) : sizeof(pt);
        void * packed_tags_ptr = dev->param.no_tags_ecc ? (void *) &pt.t: (void *)&pt;

        T(YAFFS_TRACE_MTD,
          (TSTR("nandmtd2_ReadChunkWithTagsFromNAND chunk %d data %p tags %p" TENDSTR),
           nand_chunk, data, tags));

        if (dev->param.inband_tags) {
            if (!data) {
                localData = 1;
                data = yaffs_get_temp_buffer(dev, __LINE__);
            }
        }

    #if (LINUX_VERSION_CODE > KERNEL_VERSION(2, 6, 17))
    windsome_read_twice_pos:
        /* Debug marker: if it is still present after the read, the read did not
         * fill the buffer. data may be NULL when only tags are requested, so
         * the accesses are guarded.
         */
        if (data) {
            data[0]='w'; data[1]='\5'; data[2]='n'; data[3]='d';
            data[4]='s'; data[5]='o'; data[6]='m'; data[7]='e';
        }
        if (dev->param.inband_tags || (data && !tags)) {
            retval = mtd->read(mtd, addr, dev->param.total_bytes_per_chunk,
                               &dummy, data);
            windsome_path = 0;
        } else if (tags) {
            ops.mode = MTD_OOB_AUTO;
            ops.ooblen = packed_tags_size;
            ops.len = data ? dev->data_bytes_per_chunk : packed_tags_size;
            ops.ooboffs = 0;
            ops.datbuf = data;
            ops.oobbuf = yaffs_dev_to_lc(dev)->spareBuffer;
            retval = mtd->read_oob(mtd, addr, &ops);
            windsome_path = 1;
        }
        if (data && data[0]=='w' && data[1]=='\5' && data[2]=='n' && data[3]=='d' &&
            data[4]=='s' && data[5]=='o' && data[6]=='m' && data[7]=='e') {
            printk ("windsome%d: %d mtd read error! why???? len=%d, dummy_len=%d, nand_chunk=%d, addr=0x%llx, retval=%d\n",
                    windsome_path, windsome_read_twice, dev->param.total_bytes_per_chunk, dummy, nand_chunk, addr, retval);
            if (windsome_read_twice > 0) {
                printk ("windsome: try again!\n");
                windsome_read_twice--;
                goto windsome_read_twice_pos;
            }
        }
    #else
        if (!dev->param.inband_tags && data && tags) {
            retval = mtd->read_ecc(mtd, addr, dev->data_bytes_per_chunk,
                                   &dummy, data, dev->spareBuffer, NULL);
        } else {
            if (data)
                retval = mtd->read(mtd, addr, dev->data_bytes_per_chunk,
                                   &dummy, data);
            if (!dev->param.inband_tags && tags)
                retval = mtd->read_oob(mtd, addr, mtd->oobsize, &dummy,
                                       dev->spareBuffer);
        }
    #endif

        if (dev->param.inband_tags) {
            if (tags) {
                yaffs_PackedTags2TagsPart *pt2tp;
                pt2tp = (yaffs_PackedTags2TagsPart *)&data[dev->data_bytes_per_chunk];
                yaffs_unpack_tags2tags_part(tags, pt2tp);
            }
        } else {
            if (tags) {
                memcpy(packed_tags_ptr, yaffs_dev_to_lc(dev)->spareBuffer, packed_tags_size);
                yaffs_unpack_tags2(tags, &pt, !dev->param.no_tags_ecc);
            }
        }

        if (localData)
            yaffs_release_temp_buffer(dev, data, __LINE__);

        if (tags && retval == -EBADMSG && tags->ecc_result == YAFFS_ECC_RESULT_NO_ERROR) {
            tags->ecc_result = YAFFS_ECC_RESULT_UNFIXED;
            dev->n_ecc_unfixed++;
        }
        if (tags && retval == -EUCLEAN && tags->ecc_result == YAFFS_ECC_RESULT_NO_ERROR) {
            tags->ecc_result = YAFFS_ECC_RESULT_FIXED;
            dev->n_ecc_fixed++;
        }
        if (retval == 0)
            return YAFFS_OK;
        else {
            T(YAFFS_TRACE_TRACING,
              (TSTR("nandmtd2_ReadChunkWithTagsFromNAND:retval=%d\n"), retval));
            return YAFFS_FAIL;
        }
    }
The log from this function confirms that the "mtd read error!" message below is indeed printed, and that reading a second time returns normal data.
printk ("windsome%d: %d mtd read error! why???? len=%d, dummy_len=%d, nand_chunk=%d, addr=0x%llx, retval=%d\n",
windsome_path, windsome_read_twice, dev->param.total_bytes_per_chunk, dummy, nand_chunk, addr, retval);
Based on the debugging information above, I worked around the problem for now by modifying nandmtd2_ReadChunkWithTagsFromNAND.
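The workaround boils down to a sentinel-and-retry pattern: stamp a marker at the start of the buffer, read, and read again if the marker survived. The following is a minimal user-space sketch of that pattern and not the driver code. The names read_fn, read_with_sentinel, flaky_dev and flaky_read are hypothetical, and the flaky device is only a simulation of a read that reports success without filling the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE 32

/* Hypothetical read callback: returns 0 on success, like mtd->read. */
typedef int (*read_fn)(void *ctx, unsigned char *buf, size_t len);

static const unsigned char SENTINEL[8] = { 'w', 5, 'n', 'd', 's', 'o', 'm', 'e' };

/*
 * Stamp the sentinel, read, and retry while the sentinel is still there.
 * Returns the number of retries used, or -1 if the read reported an error
 * or the buffer was still untouched after max_retries retries.
 */
static int read_with_sentinel(read_fn rd, void *ctx, unsigned char *buf,
			      size_t len, int max_retries)
{
	int retries = 0;
	int ret;

	assert(buf && len >= sizeof(SENTINEL));
	for (;;) {
		memcpy(buf, SENTINEL, sizeof(SENTINEL));
		ret = rd(ctx, buf, len);
		if (ret != 0)
			return -1; /* error reported by the lower layer */
		if (memcmp(buf, SENTINEL, sizeof(SENTINEL)) != 0)
			return retries; /* buffer was overwritten: data arrived */
		if (retries >= max_retries)
			return -1;
		retries++;
	}
}

/* Simulated device: the first `skip` reads claim success but copy nothing. */
struct flaky_dev {
	int skip;
	const char *payload;
};

static int flaky_read(void *ctx, unsigned char *buf, size_t len)
{
	struct flaky_dev *dev = ctx;

	if (dev->skip > 0) {
		dev->skip--;
		return 0;
	}
	memset(buf, 0, len);
	strncpy((char *)buf, dev->payload, len - 1);
	return 0;
}
```

With a flaky_dev whose skip is 1, read_with_sentinel(flaky_read, &dev, buf, sizeof(buf), 1) returns 1 and buf holds the payload. One limitation of the pattern: a chunk whose real contents begin with the sentinel bytes would be misreported as a failed read, which the unusual '\5' byte makes unlikely but not impossible.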
There may be another fix: modify yaffs_rd_chunk_tags_nand and add a memset() inside it. That is worth trying.
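To see what such a memset() would and would not change, here is a small user-space simulation of a shared temporary buffer. It is a sketch under the assumption that the failing read reports success and leaves the buffer untouched, as the marker check suggests. The names temp_buffer, read_chunk and load_name are hypothetical and do not exist in yaffs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE 64

/* Shared scratch buffer, reused between reads like the yaffs temp buffer. */
static unsigned char temp_buffer[CHUNK_SIZE];

/* Simulated chunk read: with silent_fail set it returns 0 but copies nothing. */
static int read_chunk(const char *stored_name, int silent_fail, unsigned char *buf)
{
	if (!silent_fail) {
		memset(buf, 0, CHUNK_SIZE);
		strncpy((char *)buf, stored_name, CHUNK_SIZE - 1);
	}
	return 0;
}

/* Load a name through the shared buffer; clear_first mimics the memset(). */
static void load_name(const char *stored_name, int silent_fail, int clear_first,
		      char *out, size_t out_size)
{
	assert(out && out_size > 0);
	if (clear_first)
		memset(temp_buffer, 0, sizeof(temp_buffer));
	read_chunk(stored_name, silent_fail, temp_buffer);
	strncpy(out, (const char *)temp_buffer, out_size - 1);
	out[out_size - 1] = '\0';
}
```

Loading "libmedia_jni.so" normally and then loading "libdvm.so" with a silent failure and no clearing yields "libmedia_jni.so" again, which matches the two identical entries seen in ls -l. The same failed read with clearing yields an empty name, which matches the "not found" symptom. In this simulation the memset() only turns a wrong name into a missing one; it does not recover the data, so a retry or a fix in the lower layer would still be needed.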
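I have not found the root cause. Is it a board stability problem that makes the driver misbehave, or is the smp8xxx_nand driver itself at fault? That needs further debugging.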