Generation of random numbers is a requirement in some of the applications for example: in finance, lottery or testing systems. For example, generation of unique random confirmation numbers upon doing a transaction, generation of a password string of fixed size in the case of lottery systems. At times, it is even required to sort in a random order – some of the applications that do online tests for applicants for a position have such a requirement wherein they require random set of questions to be given to the end users out of a pool of questions so that no two people taking the same tests have the same set of questions. Random number logic is also helpful if you want to quickly create some dummy test data.
So, how can we generate random numbers in RDBMS?
In Oracle, you can use the DBMS_RANDOM package which will help you to generate random data in character, numeric or alphanumeric formats. The size and the range of the values from which you can pick up the random values can also be specified. In case, your database is missing this package, this gets created by the dbmsrand.sql script that is available under the rdbms/admin directory under your Oracle home folder. This package has a set of functions available…here are some of the key ones
a) RANDOM: As the name itself suggests, it creates random numbers,
b) VALUE: You can provide a range for it to generate a random value for you within that range. If you do not provide a range, the range is assumed to be between 0 and 1.
c) STRING: This takes in two parameters – the first one is a string and can be U (for upper case), L (for lower case), A (for alphanumeric), X (for alphanumeric with upper case alphabets) or P (for printable characters). The second parameter is a number which specifies the length of this random string.
Some examples to demonstrate it’s usage:
A random number between 0 and 1:
SQL> select dbms_random.value from dual;
VALUE
_____________
.506991713
A random number from a range, between 1 to 1000:
SQL> select dbms_random.value(1,1000) num from dual;
NUM
_____________
448.511772
A 12 digit random number:
SQL> select dbms_random.value(100000000000, 999999999999) num from dual;
NUM
_____________
2.4232E+11
In SQL Server also, there is a RAND() function that takes in a seed value of an integer but returns back a float value between 0 and 1. There are ways to seed it appropriately so that we can always generate a random number. However, using RAND() does not always guarantee a generation of a UNIQUE random number even if you seed . It can very well repeat itself since the most common way of doing this is by using the DATEPART() function to seed it with a value. Here is an example:
SELECT RAND( (DATEPART(yy, GETDATE()))
+ (DATEPART(dy, GETDATE()))
+ (DATEPART(hh, GETDATE()))
+ (DATEPART(mi, GETDATE()))
+ (DATEPART(ss, GETDATE()))
+ DATEPART(ms, GETDATE()))
* 1000000000000000000
This will generate a random 18 digit number for you.
If you do not need the numbers to be unique all the time, then this will work fine for you. However, there is another problem with using this kind of a scheme. If you use this in a select statement, you will get the same random value within that batch since the expression will get evaluated only once and not per row. Example:
select RAND( (DATEPART(yy, GETDATE()))
+ (DATEPART(dy, GETDATE()))
+ (DATEPART(hh, GETDATE()))
+ (DATEPART(mi, GETDATE()))
+ (DATEPART(ss, GETDATE()))
+ DATEPART(ms, GETDATE()))
* 1000000000000000000 as RAND_NBR, * from sysobjects
will generate the same random number for all the records returned from sysobjects. One can by-pass that issue within a batch by using the new row_number() function (SQL Server 2005) and ordering by a unique combination of columns in the select list, example:
select RAND( row_number() over (order by id) +
(DATEPART(yy, GETDATE()))
+ (DATEPART(dy, GETDATE()))
+ (DATEPART(hh, GETDATE()))
+ (DATEPART(mi, GETDATE()))
+ (DATEPART(ss, GETDATE()))
+ DATEPART(ms, GETDATE()))
* 1000000000000000000 as RAND_NBR, * from sysobjects
If the execution of the two batches are going to be further apart, this scheme can still give you wide enough distribution but still does not guarantee that those numbers will always be UNIQUE.
If you do not need the random number to be a number, you can just use the NEWID() function. It will generate an a uniqueidentifier data-type value for you and it will be unique and can be used in a select statement as well (you will need to do manipulates using replicate(), replace() or other functions if you always went them to be a certain length):
select newid() as string_rand, * from sysobjects
Recently, we ran into a requirement at one of our client sites where the requirement posed to us was that for a particular process, a unique 18 digit NUMBER needed to be generated. The reason for doing this was so that one cannot guess the numbers since if I as an end user gets this number, I should not be able to add 1 to it and go after someone else’s instance of a transaction (this was related to a finance industry). This number would then be used to tie up the data in two different systems with other set of attributes in the system.
Here is what we did. Let’s create a table t to demonstrate the code:
create table t (col1 bigint not null primary key)
–filegroup clause
set nocount on
declare @i int
set @i = 1
while (@i <= 1000000)
begin
insert into t
SELECT ABS(CAST(CAST(NEWID() AS VARBINARY(18))AS BIGINT))
from sysobjects
set @i = @i + 1
end
If you see above, we are looping 1000000 times and are generating the unique random records 1000000 * the number of records in the sysobjects table. This generated a very diverse range of unique random numbers. We took the NEWID() function and did a cast on it as a variable binary number which was then cast to a BIGINT data-type. The ABS() function was then used to take care of any negative values.
In DB2 also, there is a RAND() function which works on the same lines as SQL Server i.e. it takes in a seed value and returns back a float value between 0 and 1. One can use the RAND() function and then multiply it with a multiplier to get the random number generated for a specific number of digits.
If you need to generate random strings, then you can use the TRANSLATE() function to convert those random numbers to characters. We would first need to generate those random numbers for the specific digits and then convert that number into a string after which we can apply the TRANSLATE() function in order to convert it to random strings. Example:
BIGINT (RAND() * 10000000000) –> get the random 10 digit number
CHAR(BIGINT(RAND() * 10000000000)) –> Now convert that to a string
TRANSLATE (CHAR(BIGINT(RAND() * 10000000000 )), ‘ABCXYZEIOU’, ‘1234567890′) –> the final step of translating it to a random string
In one of the future posts, we will look at how one can sort in a random order (lottery systems or testing systems for example use it). The ground work has already been laid out in this post, it is a matter of just using those functions in the order by clause and selecting the TOP N records that are needed for display.