Analyzing native memory leaks on Linux

Running native profiling on Linux

Read the information provided

Introduction

Several memory profilers are available for Linux. They fall into the following categories:

Preprocessor level
These profilers require a header to be compiled with the source under test. You can recompile your own JNI libraries with one of these tools to track a native memory leak in your code. Unless you have the source code for Java itself, this recompilation cannot find a leak in the JVM. Compiling this kind of tool into a large project like a JVM would almost certainly be difficult and time-consuming. Dmalloc is an example of this kind of tool.
Linker level
These profilers require the binaries under test to be relinked with a debugging library. You can relink individual JNI libraries but it is not recommended for entire Java runtimes because the runtime vendor is unlikely to support the use of modified binaries. Ccmalloc is an example of this kind of tool.
Runtime-linker level
These profilers use the LD_PRELOAD environment variable to preload a library that replaces the standard memory routines with instrumented versions. These profilers do not require recompilation or relinking of source code but many of them do not work well with Java runtimes. A Java runtime is a complicated system that can use memory and threads in unusual ways. It is worth experimenting with a few to see if they work in your scenario. NJAMD is an example of this kind of tool.
Emulator-based
Valgrind is the only example of this type of memory debugger. It emulates the underlying processor in much the same way that a Java runtime emulates the Java virtual machine. It is possible to run Java under Valgrind, but the heavy performance impact (25 to 50 times slower) makes it very difficult to run large, complicated Java applications in this manner. Valgrind is currently available on Linux x86, AMD64, PPC32, and PPC64. If you intend to use Valgrind, narrow the problem down to the smallest possible test case before you use it.
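
As an illustration of the runtime-linker approach described above, here is a minimal Python sketch of how a launcher script might set LD_PRELOAD before starting a Java process. The library path /opt/tools/libmemtrace.so and the helper name preload_env are hypothetical and stand in for whichever profiler you are trying. The sketch only prepares the environment; whether a given profiler copes with a Java runtime still has to be tested.

```python
import os


def preload_env(library_path, base_env=None):
    """Return a copy of the environment with library_path first in LD_PRELOAD.

    The dynamic linker accepts a colon-separated list, so any library that is
    already preloaded is kept after the new one.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD")
    env["LD_PRELOAD"] = f"{library_path}:{existing}" if existing else library_path
    return env


# Hypothetical profiler library; replace with the tool under evaluation.
HYPOTHETICAL_LIB = "/opt/tools/libmemtrace.so"

env = preload_env(HYPOTHETICAL_LIB, {"PATH": "/usr/bin"})
print(env["LD_PRELOAD"])

# To launch the application with the instrumented allocator, pass the
# environment to the child process, for example:
#   subprocess.run(["java", "-Djava.library.path=.", "MyApp"], env=env)
```

Building the environment in a copy, instead of exporting LD_PRELOAD globally, keeps the instrumented allocator out of unrelated processes such as the shell itself.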

For simple scenarios that can tolerate the performance overhead, Valgrind is the simplest and most user-friendly of the available free tools. Valgrind can provide a full stack trace for code paths that are leaking memory.

Running a process in Valgrind

Because Valgrind is an emulation environment, the process must be run under Valgrind, and the analysis finishes when the process ends. To use Valgrind with a process that usually runs indefinitely, you must be able to stop the process.

Some Java runtimes use thread stacks and processor registers in unusual ways. This usage can confuse some debugging tools, which expect native programs to abide by standard conventions of register use and stack structure. When using Valgrind to debug leaking JNI applications, you might see many warnings about the use of memory and some unusual thread stacks. These effects are caused by the way that the Java runtime structures its data internally and they are not a problem.

To trace the LeakyJNIApp with the Valgrind memcheck tool, use this command:

   valgrind --trace-children=yes --leak-check=full java -Djava.library.path=. com.ibm.jtc.demos.LeakyJNIApp 10 

The --trace-children=yes option makes Valgrind trace any processes that are started by the Java launcher. Some versions of the Java launcher restart themselves after setting environment variables to change behavior. If you do not specify --trace-children, you might not trace the Java runtime.

The --leak-check=full option requests that full stack traces of leaking areas of code are printed at the end of the run, instead of a summary of the state of the memory.
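
Because --trace-children=yes can produce output from more than one process, it can help to send each process's report to its own file. The following Python sketch assembles the command line shown above and adds Valgrind's --log-file option, in which %p expands to the process ID. The helper name build_valgrind_command and the log file pattern are assumptions made for illustration; the other options are the ones described in this section.

```python
import shlex


def build_valgrind_command(java_args, log_pattern="valgrind.%p.log"):
    """Return the argument list for running the Java launcher under memcheck."""
    return [
        "valgrind",
        "--trace-children=yes",          # follow the launcher if it restarts itself
        "--leak-check=full",             # print full stacks for leaking allocations
        f"--log-file={log_pattern}",     # %p becomes the PID of each traced process
        "java",
        *java_args,
    ]


cmd = build_valgrind_command(
    ["-Djava.library.path=.", "com.ibm.jtc.demos.LeakyJNIApp", "10"]
)
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))

# To run it: subprocess.run(cmd, check=False)
```

With one log file per process ID, the report that belongs to the real Java runtime can be told apart from the report of a launcher that only re-executed itself.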

Valgrind prints many warnings and errors while the command runs, most of which are not relevant in this context. When your process ends, it prints a list of leaking call stacks in ascending order of amount of memory leaked.

This example shows the end of the summary section of the Valgrind output for LeakyJNIApp on Linux x86:

   ==20494== 8,192 bytes in 8 blocks are possibly lost in loss record 36 of 45
   ==20494==    at 0x4024AB8: malloc (vg_replace_malloc.c:207)
   ==20494==    by 0x460E49D: Java_com_ibm_jtc_demos_LeakyJNIApp_nativeMethod (in /home/andhall/LeakyJNIApp/libleakyjniapp.so)
   ==20494==    by 0x535CF56: ???
   ==20494==    by 0x46423CB: gpProtectedRunCallInMethod (in /usr/local/ibm-java2-i386-50/jre/bin/libj9vm23.so)
   ==20494==    by 0x46441CF: signalProtectAndRunGlue (in /usr/local/ibm-java2-i386-50/jre/bin/libj9vm23.so)
   ==20494==    by 0x467E0D1: j9sig_protect (in /usr/local/ibm-java2-i386-50/jre/bin/libj9prt23.so)
   ==20494==    by 0x46425FD: gpProtectAndRun (in /usr/local/ibm-java2-i386-50/jre/bin/libj9vm23.so)
   ==20494==    by 0x4642A33: gpCheckCallin (in /usr/local/ibm-java2-i386-50/jre/bin/libj9vm23.so)
   ==20494==    by 0x464184C: callStaticVoidMethod (in /usr/local/ibm-java2-i386-50/jre/bin/libj9vm23.so)
   ==20494==    by 0x80499D3: main (in /usr/local/ibm-java2-i386-50/jre/bin/java)
   ==20494==
   ==20494==
   ==20494== 65,536 (63,488 direct, 2,048 indirect) bytes in 62 blocks are definitely lost in loss record 42 of 45
   ==20494==    at 0x4024AB8: malloc (vg_replace_malloc.c:207)
   ==20494==    by 0x460E49D: Java_com_ibm_jtc_demos_LeakyJNIApp_nativeMethod (in /home/andhall/LeakyJNIApp/libleakyjniapp.so)
   ==20494==    by 0x535CF56: ???
   ==20494==    by 0x46423CB: gpProtectedRunCallInMethod (in /usr/local/ibm-java2-i386-50/jre/bin/libj9vm23.so)
   ==20494==    by 0x46441CF: signalProtectAndRunGlue (in /usr/local/ibm-java2-i386-50/jre/bin/libj9vm23.so)
   ==20494==    by 0x467E0D1: j9sig_protect (in /usr/local/ibm-java2-i386-50/jre/bin/libj9prt23.so)
   ==20494==    by 0x46425FD: gpProtectAndRun (in /usr/local/ibm-java2-i386-50/jre/bin/libj9vm23.so)
   ==20494==    by 0x4642A33: gpCheckCallin (in /usr/local/ibm-java2-i386-50/jre/bin/libj9vm23.so)
   ==20494==    by 0x464184C: callStaticVoidMethod (in /usr/local/ibm-java2-i386-50/jre/bin/libj9vm23.so)
   ==20494==    by 0x80499D3: main (in /usr/local/ibm-java2-i386-50/jre/bin/java)
   ==20494==
   ==20494== LEAK SUMMARY:
   ==20494==    definitely lost: 63,957 bytes in 69 blocks.
   ==20494==    indirectly lost: 2,168 bytes in 12 blocks.
   ==20494==      possibly lost: 8,600 bytes in 11 blocks.
   ==20494==    still reachable: 5,156,340 bytes in 980 blocks.
   ==20494==         suppressed: 0 bytes in 0 blocks.
   ==20494== Reachable blocks (those to which a pointer was found) are not shown.
   ==20494== To see them, rerun with: --leak-check=full --show-reachable=yes

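The second line of each stack shows that the leaked memory was allocated by the native implementation of the com.ibm.jtc.demos.LeakyJNIApp.nativeMethod() method, which appears in the trace as the symbol Java_com_ibm_jtc_demos_LeakyJNIApp_nativeMethod.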
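
When the report is long, picking out the JNI frames by eye becomes tedious. The following Python sketch shows one way to extract the loss records from a memcheck log and list the JNI entry points in each stack, largest leak first. It is a sketch under stated assumptions: each frame is on a single line, as Valgrind writes it; the function names are hypothetical; and the conversion from a JNI symbol to a Java method name handles only simple names, ignoring the JNI escape sequences used for underscores, non-ASCII characters, and overloaded methods. The sample text is an abbreviated copy of the output above.

```python
import re

# Header of a loss record, for example:
#   ==20494== 8,192 bytes in 8 blocks are possibly lost in loss record 36 of 45
HEADER = re.compile(
    r"^==\d+==\s+([\d,]+)(?: \([^)]*\))? bytes in ([\d,]+) blocks are "
    r"(definitely|indirectly|possibly) lost in loss record \d+ of \d+"
)
# One stack frame: "==PID==    at|by 0xADDRESS: symbol ..."
FRAME = re.compile(r"^==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (\S+)")


def parse_loss_records(log_text):
    """Return a list of dicts with bytes, blocks, kind, and frame symbols."""
    records, current = [], None
    for raw in log_text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        frame = FRAME.match(line)
        if header:
            current = {
                "bytes": int(header.group(1).replace(",", "")),
                "blocks": int(header.group(2).replace(",", "")),
                "kind": header.group(3),
                "frames": [],
            }
            records.append(current)
        elif frame and current is not None:
            current["frames"].append(frame.group(1))
        else:
            # Any other line (such as a bare "==PID==") ends the current record.
            current = None
    return records


def jni_symbol_to_java_name(symbol):
    """Map a simple JNI symbol to a dotted Java name (no escape handling)."""
    if not symbol.startswith("Java_"):
        return None
    return symbol[len("Java_"):].replace("_", ".")


SAMPLE = """\
==20494== 8,192 bytes in 8 blocks are possibly lost in loss record 36 of 45
==20494==    at 0x4024AB8: malloc (vg_replace_malloc.c:207)
==20494==    by 0x460E49D: Java_com_ibm_jtc_demos_LeakyJNIApp_nativeMethod (in /home/andhall/LeakyJNIApp/libleakyjniapp.so)
==20494==    by 0x535CF56: ???
==20494==
==20494== 65,536 (63,488 direct, 2,048 indirect) bytes in 62 blocks are definitely lost in loss record 42 of 45
==20494==    at 0x4024AB8: malloc (vg_replace_malloc.c:207)
==20494==    by 0x460E49D: Java_com_ibm_jtc_demos_LeakyJNIApp_nativeMethod (in /home/andhall/LeakyJNIApp/libleakyjniapp.so)
==20494==    by 0x535CF56: ???
==20494==
"""

# List the records from largest to smallest with their JNI entry points.
for record in sorted(parse_loss_records(SAMPLE), key=lambda r: r["bytes"], reverse=True):
    jni_methods = [
        jni_symbol_to_java_name(symbol)
        for symbol in record["frames"]
        if symbol.startswith("Java_")
    ]
    print(record["bytes"], record["kind"], jni_methods)
```

Sorting largest first reverses the ascending order in which Valgrind prints the records, so the most significant leak is the first thing you read.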

Identify the native heap leak owner