Generating random IDs for a database

Problem description:

I have a requirement on a project where:

  • I need to generate unique IDs.
  • IDs must be upper case.
  • I cannot check the database to see if an ID has been used previously.

We expect to have many millions of records added to the database every month.

I have tried the solutions from "PHP: How to generate a random, unique, alphanumeric string?", and while they seem to work at first, my testing has shown that duplicates would appear over time.

Now I am looking at using uniqid with a prefix. The problem I found with uniqid without a prefix is that duplicates are generated when simultaneous requests reach the server at exactly the same time. I am hoping that a prefix would solve this.
I am thinking of using this function:

private function generate_id()
{
    $alpha_numeric = 'ABCDEFGHIJKLMNPQRSTUVWXYZ0123456789';
    $max = strlen($alpha_numeric);
    $prefix = '';

    for ($i = 0; $i < 5; $i++)
    {
        $prefix .= $alpha_numeric[random_int(0, $max - 1)];
    }
    return strtoupper(uniqid($prefix));
}

The prefix would be a 5 character alphanumeric string. Would this be enough to satisfy my requirements?

*****Edit*****

Using a UUID as suggested would be the best way to limit the chance of a collision, but it has been decided to go with the approach above and increase the prefix to 7 characters. The chance of a collision if two IDs were generated at the same millisecond would be around 1 in 8.3 million. That has been deemed acceptable by the higher-ups.

If you use Composer or external libraries, see https://github.com/ramsey/uuid

Otherwise, this function may meet your needs. Since your IDs must be upper case, apply strtoupper to the result:

/**
 * generate
 *
 * Returns a version 4 UUID
 *
 * @access public
 * @return string
 */
public static function generate()
{
    $data = openssl_random_pseudo_bytes(16);

    $data[6] = chr(ord($data[6]) & 0x0f | 0x40); // set version to 0100
    $data[8] = chr(ord($data[8]) & 0x3f | 0x80); // set bits 6-7 to 10

    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($data), 4));
}

See https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)
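
For illustration, here is a minimal standalone sketch of the same idea with the upper-casing already applied. The function name uuid_v4_upper is hypothetical, and it assumes PHP 7 or later so that random_bytes can stand in for openssl_random_pseudo_bytes:

```php
<?php
// Hypothetical standalone variant of the method above (name is an assumption).
function uuid_v4_upper(): string
{
    // 16 random bytes from the OS CSPRNG (PHP 7+).
    $data = random_bytes(16);

    $data[6] = chr((ord($data[6]) & 0x0f) | 0x40); // set version to 0100 (4)
    $data[8] = chr((ord($data[8]) & 0x3f) | 0x80); // set the two variant bits to 10

    // Format as 8-4-4-4-12 hex groups, then upper-case as the question requires.
    $uuid = vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($data), 4));

    return strtoupper($uuid);
}

echo uuid_v4_upper(), PHP_EOL;
```

If the hyphens are unwanted, str_replace('-', '', ...) on the result removes them without affecting the random bits.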

Have you considered using a unique key in the database to enforce uniqueness? In that case you won't have to check for duplicates yourself: you generate a value and attempt to insert the record, repeating until the insert succeeds.

If you use MySQL, read "Using MySQL UNIQUE Index To Prevent Duplicates". If not, look up the documentation of your database of choice.
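
A rough sketch of that insert-and-retry loop follows. It is only an illustration under stated assumptions: the table name records, the helper insert_with_unique_id, and the in-memory SQLite database (which needs the pdo_sqlite extension) are all hypothetical stand-ins for your real schema and driver.

```php
<?php
// Hypothetical helper: insert a row, asking for a fresh ID whenever the
// unique constraint rejects a duplicate. The database does the checking.
function insert_with_unique_id(PDO $pdo, callable $generateId, int $maxAttempts = 5): string
{
    $stmt = $pdo->prepare('INSERT INTO records (id) VALUES (:id)');

    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $id = $generateId();
        try {
            $stmt->execute([':id' => $id]);
            return $id; // insert succeeded, so the ID was unused
        } catch (PDOException $e) {
            // SQLSTATE class 23 means an integrity constraint violation;
            // anything else is a real error and is rethrown.
            if (substr((string) $e->getCode(), 0, 2) !== '23') {
                throw $e;
            }
        }
    }

    throw new RuntimeException("No unique ID found after $maxAttempts attempts");
}

// Demo setup (assumption): in-memory SQLite with the ID as primary key.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE records (id TEXT PRIMARY KEY)');

$newId = insert_with_unique_id($pdo, function () {
    return strtoupper(bin2hex(random_bytes(8)));
});
echo $newId, PHP_EOL;
```

Capping the number of attempts keeps a misbehaving generator from looping forever; with a reasonably large ID space a retry should almost never happen.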

uniqid does not guarantee that its return value is unique! Call the function with more_entropy set to TRUE to increase the chance of getting a unique value.

return strtoupper(uniqid($prefix, true));
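
One thing to be aware of: with more_entropy enabled, the value returned by uniqid contains a "." character. A minimal sketch that strips it so the ID stays upper-case alphanumeric (the function name generate_entropy_id is hypothetical, and $prefix stands for the random prefix from the question):

```php
<?php
// Hypothetical helper name; $prefix is the random prefix from the question.
function generate_entropy_id(string $prefix = ''): string
{
    // The second argument to uniqid() is more_entropy; it appends extra
    // characters, including a ".", to the time-based value.
    $raw = strtoupper(uniqid($prefix, true));

    // Keep only upper-case letters and digits.
    return preg_replace('/[^A-Z0-9]/', '', $raw);
}

echo generate_entropy_id('ABCDE'), PHP_EOL;
```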

Is it absolutely necessary to limit yourself to only uppercase letters and numbers? Doing so reduces the number of distinct values the function can generate at a given length, compared with using uppercase, lowercase, numbers and symbols.

You can also consider cryptographic functions to increase randomness.

If you are using PHP 7, take a look at http://php.net/manual/en/function.random-bytes.php

e.g.

<?php
echo strtoupper(bin2hex(random_bytes(32)));
?>

This should be unique enough for your requirements; use more bytes if you feel you need to.
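
The hex string above uses only 16 of the allowed symbols. If you would rather draw from the full alphabet in the question, a sketch along these lines may work; the function name random_id and the default length of 20 are assumptions chosen for illustration. With 35 symbols each character carries about 5.13 bits, so 20 characters give roughly 102 bits of randomness.

```php
<?php
// Hypothetical helper: a random ID built from the question's alphabet.
function random_id(int $length = 20): string
{
    $alphabet = 'ABCDEFGHIJKLMNPQRSTUVWXYZ0123456789'; // as in the question (no "O")
    $max = strlen($alphabet) - 1;
    $id = '';

    for ($i = 0; $i < $length; $i++) {
        // random_int() draws from the OS CSPRNG without modulo bias.
        $id .= $alphabet[random_int(0, $max)];
    }

    return $id;
}

echo random_id(), PHP_EOL;
```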

Generally speaking, duplicates will always be possible when you can't check the database for existing values. All you can do is reduce the probability of duplicates until it is low enough for your use case. This is the idea behind GUIDs.
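
To judge what "low enough" means, the usual birthday approximation p ≈ 1 - exp(-n(n-1) / (2N)) estimates the chance of at least one duplicate among n random IDs drawn from N possible values. The sketch below is only an estimate that assumes uniformly random IDs; the function name and the example figures (100 million IDs, the 122 random bits of a version 4 UUID) are illustrative assumptions.

```php
<?php
// Hypothetical helper: approximate probability of at least one collision
// among $n uniformly random IDs drawn from $space possible values.
function collision_probability(float $n, float $space): float
{
    // Birthday approximation: p ≈ 1 - exp(-n(n-1) / (2N)).
    // expm1() keeps precision when the exponent is very close to zero.
    return -expm1(-$n * ($n - 1) / (2 * $space));
}

// Example: 100 million IDs with 122 random bits each.
printf("%.3e\n", collision_probability(1e8, 2 ** 122));
```

The same function can be pointed at a smaller ID space, for example 35 ** 7 for a 7-character prefix from the question's alphabet, to see how quickly the probability grows with n.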

If you really can't access the database and you are really limited to upper-case characters, then I would recommend generating a GUID-like value with the uniqid function, removing the characters you don't want, and converting the result to upper case. If you are afraid that duplicates might occur, concatenate two or more of these values to reduce the probability.

Something like:

$unique_string = str_replace(".", "", strtoupper(uniqid(uniqid(uniqid(), true), true)));