Quick Notes on the boost::pool Library

Usage example

#include <functional>
#include <iostream>

#include <boost/pool/pool.hpp>
#include <boost/pool/object_pool.hpp>

namespace xy{
    using namespace std;

    class A{
    public:
        A(){
            cout<<"Construct: "<<endl;
        }
        A(int a){
            cout<<"Construct: "<<a<<endl;
        }
        ~A(){
            cout<<"Destruct"<<endl;
        }
    };

    function< void(void) > f = [](){
        boost::pool< > p{ sizeof(char) };
        void *ptr = p.ordered_malloc(sizeof(A));
        A* a = new (ptr)A();
        a->~A();
    };
    function<void(void)> f_object = [](){
        boost::object_pool< A > p{};
        A *ptr = p.construct(1);
    };
}
int main()
{
    xy::f_object();
}

boost::pool

  • The constructor takes a chunk size (Chunk_Size). If the pool is used as a general-purpose allocator, the exact value hardly matters; a common choice is 1 byte / sizeof(char), which makes it easy to allocate space for different types and sizes (boost::pool handles alignment internally).
  • malloc() returns one chunk of the size given to the constructor (Chunk_Size). ordered_malloc(n) returns a run of n*Chunk_Size bytes. So with Chunk_Size = 1, allocating memory for an object of class A is just ordered_malloc(sizeof(A)); with Chunk_Size > 1 you need ordered_malloc((sizeof(A) + Chunk_Size - 1) / Chunk_Size), i.e. enough chunks to cover sizeof(A).
  • When you allocate memory of varying size from boost::pool like this, you must call the constructor yourself via placement new, i.e. A* a = new (ptr)A();, and call the destructor explicitly, i.e. a->~A();.
  • Memory is handed back by explicitly calling free(void * chunk); or ordered_free(void * chunk);. You can also skip this: when the pool object is destroyed, it returns all of its memory to the system. The pool destructor does not run the destructors of the objects placed in that memory, though, so you always have to call the object destructors yourself.
  • In practice the pool is rarely used this way (allocating blocks of different sizes from one pool), because differently sized objects fragment the pool over time. Most uses allocate objects of a single fixed size. A minimal sketch of the raw-memory style follows this list.
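
Here is that sketch; Point is just an illustrative type, not something from the original example:

#include <new>                      // placement new
#include <boost/pool/pool.hpp>

struct Point {
    double x, y;
    Point(double x_, double y_) : x(x_), y(y_) {}
};

int main()
{
    boost::pool<> byte_pool(sizeof(char));            // 1-byte chunks, general-purpose use

    // grab enough contiguous 1-byte chunks to hold one Point, then construct in place
    void * raw = byte_pool.ordered_malloc(sizeof(Point));
    Point * p = new (raw) Point(1.0, 2.0);

    p->~Point();                                       // the destructor is our responsibility
    byte_pool.ordered_free(raw, sizeof(Point));        // optionally hand the chunks back early
    // anything not freed is reclaimed when byte_pool itself is destroyed
}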

boost::object_pool

  • The boost::object_pool<T> used in f_object is a pool whose contents are all objects of type T. If you obtain memory through malloc(), the object's constructor is not called; a pointer returned by p.construct(..) has had the constructor run.
  • destroy(T * p); calls T's destructor and then returns the memory to the pool; free(T * p); returns the memory without calling the destructor.
    When the pool itself is destroyed, it always calls the destructor of every remaining object. A small sketch of these calls follows this list.
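
A sketch of the different calls, assuming the class A from the usage example above is visible:

void object_pool_demo()
{
    boost::object_pool<A> pool;

    A * raw = pool.malloc();        // raw memory only: A's constructor is NOT run
    A * obj = pool.construct(42);   // memory + the A(int) constructor

    pool.destroy(obj);              // runs ~A(), then returns the memory to the pool
    pool.free(raw);                 // returns the memory, no destructor call
    // objects still alive when `pool` is destroyed get their destructors called then
}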

boost::pool implementation

class simple_segregated_storage

The class simple_segregated_storage is the core of boost::pool's internal data structure. What it maintains is essentially a "linked list". The quotes are there because it does not use anything like std::list (too slow for the author's taste), but hand-rolls its own list, like this:

/*
This is a member function of simple_segregated_storage. It looks baffling at first: isn't this just *ptr?
The point is that a void* cannot be dereferenced, because *ptr would have type void, which C++ does not allow.
Casting to void** first and dereferencing that yields a reference to the "next" pointer stored at the start of the chunk.
*/
static void * & nextof(void * const ptr)
{
      return *(static_cast<void **>(ptr));
}

[Figure: free chunks linked through the pointer stored in their first bytes]

As the figure above shows, chunk A and chunk B may or may not be adjacent in memory. The first sizeof(void*) bytes (4 bytes on a 32-bit build) of each free chunk hold the start address of the next free chunk, or 0 for the last chunk in the list.
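
A tiny standalone illustration of this trick (not boost code): the "next" pointer lives inside the free chunk itself, so the free list costs no extra memory.

#include <cassert>
#include <cstddef>

// same idea as nextof(): reinterpret the first bytes of a free chunk as a void*
static void * & nextof(void * const ptr) { return *(static_cast<void **>(ptr)); }

int main()
{
    const std::size_t chunk_size = 32;
    alignas(void *) char buffer[3 * chunk_size];  // pretend this is one big block

    char * a = buffer;                            // chunk A
    char * b = buffer + chunk_size;               // chunk B
    char * c = buffer + 2 * chunk_size;           // chunk C

    nextof(a) = b;                                // A -> B
    nextof(b) = c;                                // B -> C
    nextof(c) = 0;                                // C is the last free chunk

    assert(nextof(a) == b && nextof(nextof(a)) == c);
}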

The code below adds a large block of memory to a simple_segregated_storage. (Yes, there is only add_block and no remove_block: once a block has been added, it stays under that simple_segregated_storage's management until the storage itself is destroyed.)

//segregate() splits a block of sz bytes into chunks of partition_sz bytes each; in the figure above, chunks A and B are partition_sz bytes.
//The first bytes of each chunk (the chunk header) point to the next chunk, acting as the list's "next"; the header of the last chunk points to end.

template <typename SizeType>
void * simple_segregated_storage<SizeType>::segregate(
    void * const block,
    const size_type sz,
    const size_type partition_sz,
    void * const end)
{
  char * old = static_cast<char *>(block)
      + ((sz - partition_sz) / partition_sz) * partition_sz;

  nextof(old) = end;

  if (old == block)
    return block;

  for (char * iter = old - partition_sz; iter != block;
      old = iter, iter -= partition_sz)
    nextof(iter) = old;

  nextof(block) = old;

  return block;
}
//Adding a block splits it into chunks and prepends them to the head of the free list. Unordered, so O(1).
void add_block(void * const block,
        const size_type nsz, const size_type npartition_sz)
    {
      BOOST_POOL_VALIDATE_INTERNALS
      first = segregate(block, nsz, npartition_sz, first);
      BOOST_POOL_VALIDATE_INTERNALS
    }
//Ordered insertion of a block: split it into chunks, use find_prev (address comparison via std::greater) to locate where the block belongs in the sorted free list, then splice it in. O(n).
void add_ordered_block(void * const block,
        const size_type nsz, const size_type npartition_sz)
    {
      BOOST_POOL_VALIDATE_INTERNALS
      // find the last free chunk that precedes this block (0 if none)
      void * const loc = find_prev(block);

      if (loc == 0)
        add_block(block, nsz, npartition_sz);
      else
        nextof(loc) = segregate(block, nsz, npartition_sz, nextof(loc));
      BOOST_POOL_VALIDATE_INTERNALS
    }

//Nothing surprising here: walk the list comparing addresses to find where ptr belongs; this is what keeps the list sorted.
template <typename SizeType>
void * simple_segregated_storage<SizeType>::find_prev(void * const ptr)
{
  if (first == 0 || std::greater<void *>()(first, ptr))
    return 0;

  void * iter = first;
  while (true)
  {
    if (nextof(iter) == 0 || std::greater<void *>()(nextof(iter), ptr))
      return iter;

    iter = nextof(iter);
  }
}
//Member variable of simple_segregated_storage: the head pointer of the free list.
void * first;

The code below takes memory out of the simple_segregated_storage free list:

template <typename SizeType>
void * simple_segregated_storage<SizeType>::malloc_n(const size_type n,
    const size_type partition_size)
{
   BOOST_POOL_VALIDATE_INTERNALS
  if(n == 0)
    return 0;
  void * start = &first;
  void * iter;
  do
  {
    if (nextof(start) == 0)
      return 0;
    iter = try_malloc_n(start, n, partition_size);
  } while (iter == 0);
  //ret is the first chunk of the run being handed out
  void * const ret = nextof(start);
  //classic singly-linked-list unlink: the chunk before the run now points to whatever
  //the last chunk of the run pointed to, so the whole run drops out of the free list
  nextof(start) = nextof(iter);
  BOOST_POOL_VALIDATE_INTERNALS
  return ret;
}

//try_malloc_n checks whether the chunks right after *start form a run of n contiguous partition_size-sized chunks. On success it returns a pointer to the last chunk of the run; on failure it returns 0 and advances start to where the run broke, so the caller can keep searching from there.
template <typename SizeType>
void * simple_segregated_storage<SizeType>::try_malloc_n(
    void * & start, size_type n, const size_type partition_size)
{
  void * iter = nextof(start);
  //check whether the chunks starting after start form n contiguous chunks of partition_size bytes
  while (--n != 0)
  {
    void * next = nextof(iter);
    //if next != static_cast<char *>(iter) + partition_size, the following chunk is either in use or we have reached the end of this large block
    if (next != static_cast<char *>(iter) + partition_size)
    {
      // next == 0 (end-of-list) or non-contiguous chunk found
      start = iter;
      return 0;
    }
    iter = next;
  }
  return iter;
}

class PODptr

[Figure: PODptr block layout: chunk data, next ptr, next block size]

As the figure above shows, a PODptr describes one block. These blocks are not necessarily the same size, but each consists of three parts: chunk data + next ptr + next block size.

  • The chunk data part is one contiguous region that is handed to a simple_segregated_storage and split into chunks.
  • next ptr points to the next block; next block size records how large that next block is.
  • In other words, the PODptr blocks form a linked list, while inside each block the simple_segregated_storage forms a sequential list of chunks.
  • The size of a PODptr block is not fixed; see void * pool<UserAllocator>::malloc_need_resize() for how it grows. A sketch of how the trailing fields are located follows this list.
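
A simplified sketch of how such a block could locate its two trailing fields, following the layout described above. This only illustrates the idea and is not the verbatim PODptr source; the real code also inserts lcm-based padding so the pointer field stays properly aligned.

#include <cstddef>

// illustrative only: a block of `size` bytes laid out as
//   [ ... chunk data ... ][ next block ptr ][ next block size ]
struct block_view
{
    char *      ptr;    // start of the block
    std::size_t size;   // total size of the block, trailing fields included

    // the size of the next block occupies the last sizeof(std::size_t) bytes
    char * next_size_field() const { return ptr + size - sizeof(std::size_t); }

    // the pointer to the next block sits just before it (alignment padding ignored)
    char * next_ptr_field() const { return next_size_field() - sizeof(void *); }

    // the usable chunk area is everything before the two trailing fields
    std::size_t chunk_area() const { return size - sizeof(std::size_t) - sizeof(void *); }
};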

class pool

//pool derives from simple_segregated_storage
template <typename UserAllocator>
class pool: protected simple_segregated_storage < typename UserAllocator::size_type >;

//returns a reference to the base-class subobject so that its member functions can be called
simple_segregated_storage<size_type> & store()
{ //! \returns pointer to store.
      return *this;
}

When pool::malloc is called for a single chunk, the pool hands one out through the base class if there is space; otherwise it goes off and allocates a fresh block. The code is simple, so it is not reproduced here.
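
Roughly, the single-chunk path looks like this (paraphrased, not quoted):

void * malloc BOOST_PREVENT_MACRO_SUBSTITUTION()
{
  // if the free list still has chunks, pop one in O(1) via the base class
  if (!store().empty())
    return (store().malloc)();
  // otherwise grow: allocate a new block, add its chunks to the free list, retry
  return malloc_need_resize();
}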

The code below allocates n contiguous chunks. If no run of n contiguous chunks exists, new memory must be allocated: the freshly allocated block is split into chunks and added to the ordered chunk free list with add_ordered_block, and the block itself is inserted into the PODptr list at its address-ordered position.

template <typename UserAllocator>
void * pool<UserAllocator>::ordered_malloc(const size_type n)
{ //! Gets address of a chunk n, allocating new memory if not already available.
  //! \returns Address of chunk n if allocated ok.
  //! \returns 0 if not enough memory for n chunks.

  const size_type partition_size = alloc_size();
  const size_type total_req_size = n * requested_size;
  const size_type num_chunks = total_req_size / partition_size +
      ((total_req_size % partition_size) ? true : false);

  void * ret = store().malloc_n(num_chunks, partition_size);

#ifdef BOOST_POOL_INSTRUMENT
  std::cout << "Allocating " << n << " chunks from pool of size " << partition_size << std::endl;
#endif
  if ((ret != 0) || (n == 0))
    return ret;

#ifdef BOOST_POOL_INSTRUMENT
  std::cout << "Cache miss, allocating another chunk...\n";
#endif

  // Not enough memory in our storages; make a new storage,
  BOOST_USING_STD_MAX();

  //Compute the size of the next allocation, which essentially doubles each time. integer::static_lcm computes a least common multiple (used to keep the trailing fields aligned).
  //(Personal aside: before I joined Giant Network, Mr. Hao asked me in an interview how STL vector growth is implemented. I had just been reading Java's list code and mixed the two up, so I couldn't recall whether it was 2x or 1.5x.
  //He was a genuinely good guy. I later left for other reasons, which was a pity; I could have learned a lot from him.
  //I also remember people online arguing over whether Java's growth strategy or the STL's is better. Frankly, agonizing over that is silly. Does it really matter?)
  next_size = max BOOST_PREVENT_MACRO_SUBSTITUTION(next_size, num_chunks);
  size_type POD_size = static_cast<size_type>(next_size * partition_size +
      integer::static_lcm<sizeof(size_type), sizeof(void *)>::value + sizeof(size_type));
  char * ptr = (UserAllocator::malloc)(POD_size);
  if (ptr == 0)
  {
     if(num_chunks < next_size)
     {
        // Try again with just enough memory to do the job, or at least whatever we
        // allocated last time:
        next_size >>= 1;
        next_size = max BOOST_PREVENT_MACRO_SUBSTITUTION(next_size, num_chunks);
        POD_size = static_cast<size_type>(next_size * partition_size +
            integer::static_lcm<sizeof(size_type), sizeof(void *)>::value + sizeof(size_type));
        ptr = (UserAllocator::malloc)(POD_size);
     }
     if(ptr == 0)
       return 0;
  }
  const details::PODptr<size_type> node(ptr, POD_size);

  // Split up block so we can use what wasn't requested.
  if (next_size > num_chunks)
    store().add_ordered_block(node.begin() + num_chunks * partition_size,
        node.element_size() - num_chunks * partition_size, partition_size);

  BOOST_USING_STD_MIN();
  if(!max_size)
    next_size <<= 1;
  else if( next_size*partition_size/requested_size < max_size)
    next_size = min BOOST_PREVENT_MACRO_SUBSTITUTION(next_size << 1, max_size*requested_size/ partition_size);

  //  insert it into the list,
  //   handle border case.
  //insert the new large block into the address-ordered block list
  if (!list.valid() || std::greater<void *>()(list.begin(), node.begin()))
  {
    node.next(list);
    list = node;
  }
  else
  {
    details::PODptr<size_type> prev = list;

    while (true)
    {
      // if we're about to hit the end, or if we've found where "node" goes.
      if (prev.next_ptr() == 0
          || std::greater<void *>()(prev.next_ptr(), node.begin()))
        break;

      prev = prev.next();
    }

    node.next(prev.next());
    prev.next(node);
  }

  //  and return it.
  return node.begin();
}

The code below releases blocks that are entirely unused. (A block is not released if even one of its chunks is still allocated.)

template <typename UserAllocator>
bool pool<UserAllocator>::release_memory()
{ //! pool must be ordered. Frees every memory block that doesn't have any allocated chunks.
  //! \returns true if at least one memory block was freed.

  // ret is the return value: it will be set to true when we actually call
  //  UserAllocator::free(..)
  bool ret = false;

  // This is a current & previous iterator pair over the memory block list
  details::PODptr<size_type> ptr = list;
  details::PODptr<size_type> prev;

  // This is a current & previous iterator pair over the free memory chunk list
  //  Note that "prev_free" in this case does NOT point to the previous memory
  //  chunk in the free list, but rather the last free memory chunk before the
  //  current block.
  void * free_p = this->first;
  void * prev_free_p = 0;

  const size_type partition_size = alloc_size();

  // Search through all the allocated memory blocks
  while (ptr.valid())
  {
    // At this point:
    //  ptr points to a valid memory block
    //  free_p points to either:
    //    0 if there are no more free chunks
    //    the first free chunk in this or some next memory block
    //  prev_free_p points to either:
    //    the last free chunk in some previous memory block
    //    0 if there is no such free chunk
    //  prev is either:
    //    the PODptr whose next() is ptr
    //    !valid() if there is no such PODptr

    // If there are no more free memory chunks, then every remaining
    //  block is allocated out to its fullest capacity, and we can't
    //  release any more memory
    if (free_p == 0)
      break;

    // We have to check all the chunks.  If they are *all* free (i.e., present
    //  in the free list), then we can free the block.
    bool all_chunks_free = true;

    // Iterate 'i' through all chunks in the memory block
    // if free starts in the memory block, be careful to keep it there
    void * saved_free = free_p;
    for (char * i = ptr.begin(); i != ptr.end(); i += partition_size)
    {
      // If this chunk is not free
      if (i != free_p)
      {
        // We won't be able to free this block
        all_chunks_free = false;

        // free_p might have travelled outside ptr
        free_p = saved_free;
        // Abort searching the chunks; we won't be able to free this
        //  block because a chunk is not free.
        break;
      }

      // We do not increment prev_free_p because we are in the same block
      free_p = nextof(free_p);
    }

    // post: if the memory block has any chunks, free_p points to one of them
    // otherwise, our assertions above are still valid

    const details::PODptr<size_type> next = ptr.next();

    if (!all_chunks_free)
    {
      if (is_from(free_p, ptr.begin(), ptr.element_size()))
      {
        std::less<void *> lt;
        void * const end = ptr.end();
        do
        {
          prev_free_p = free_p;
          free_p = nextof(free_p);
        } while (free_p && lt(free_p, end));
      }
      // This invariant is now restored:
      //     free_p points to the first free chunk in some next memory block, or
      //       0 if there is no such chunk.
      //     prev_free_p points to the last free chunk in this memory block.

      // We are just about to advance ptr.  Maintain the invariant:
      // prev is the PODptr whose next() is ptr, or !valid()
      // if there is no such PODptr
      prev = ptr;
    }
    else
    {
      // All chunks from this block are free

      // Remove block from list
      if (prev.valid())
        prev.next(next);
      else
        list = next;

      // Remove all entries in the free list from this block
      //The key point: after a block is released, the link in the last free chunk before it is patched to skip over the released block's chunks.
      if (prev_free_p != 0)
        nextof(prev_free_p) = free_p;
      else
        this->first = free_p;

      // And release memory
      (UserAllocator::free)(ptr.begin());
      ret = true;
    }

    // Increment ptr
    ptr = next;
  }

  next_size = start_size;
  return ret;
}

pool summary

pool is implemented essentially on top of the chunk free list maintained by simple_segregated_storage, which really is its core. But pool maintains two lists in total:

  • The chunk free list inside simple_segregated_storage. Allocating a single chunk simply pops one off this list, O(1).
  • pool also has a member details::PODptr<size_type> list; which maintains a list of the large blocks. Memory inside a block is contiguous, but different blocks generally are not. This list acts as an address index and mainly exists to speed up lookups: for an ordered pool, when memory is returned, it lets the pool quickly decide which block an address belongs to. Without it, every chunk would have to be compared by address one by one. A short usage sketch follows this list.
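
A short usage sketch tying the pieces together (the chunk size and counts are arbitrary):

#include <boost/pool/pool.hpp>

int main()
{
    boost::pool<> p(64);               // 64-byte chunks

    void * run = p.ordered_malloc(4);  // 4 contiguous chunks; may trigger a new block allocation
    void * one = p.malloc();           // a single chunk, O(1) pop off the free list

    p.ordered_free(run, 4);            // return the run, keeping the free list address-ordered
    p.ordered_free(one);               // same for the single chunk
    p.release_memory();                // free every block whose chunks are all free (ordered pool only)
}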

class object_pool

object_pool implementation

class object_pool: protected pool<UserAllocator>;

object_pool inherits from pool. The difference is that pool allocates memory of a fixed size, while object_pool allocates memory for a fixed type and also runs the constructor and destructor. There are really only two functions of interest:

construct calls the constructor via placement new, which is old news by now.
The only thing worth grumbling about is the overloads that take constructor arguments: the pool library generates a pool_construct.ipp file to cover constructors with parameters. I wondered why it didn't use std::forward, but std::forward only entered the standard with C++11. Then why not boost::detail::sp_forward from boost::smart_ptr? Because smart_ptr was added in boost 1.23.0, while pool is an old-timer from 1.21.0. Those who come later sit in the shade of trees planted by those before them, and that shade was the blazing sun the planters stood in!

element_type * construct(Arg1&, ... ArgN&){...}
element_type * construct()
{
  element_type * const ret = (malloc)();
  if (ret == 0)
    return ret;
  try { new (ret) element_type(); }
  catch (...) { (free)(ret); throw; }
  return ret;
}
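
With C++11, those argument-taking overloads could be collapsed into one perfect-forwarding template, roughly like this (a sketch of the idea, not what the library shipped at the time):

// needs <utility> for std::forward
template <typename... Args>
element_type * construct(Args &&... args)
{
  element_type * const ret = (malloc)();
  if (ret == 0)
    return ret;
  try { new (ret) element_type(std::forward<Args>(args)...); }
  catch (...) { (free)(ret); throw; }
  return ret;
}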

destroy calls the destructor explicitly, then hands the memory back to the free list.

void destroy(element_type * const chunk)
{
  chunk->~T();
  (free)(chunk);
}
void free BOOST_PREVENT_MACRO_SUBSTITUTION(element_type * const chunk)
{
  store().ordered_free(chunk);
}