# is "rand() % 100;" not very random?

#### null_reflections

##### Legendary Coder
I'm just wondering whether what the folks on this stack say is totally true, or maybe outdated, and if it is true, what they mean:

The thread is over 13 years old, but when I test that particular way to establish a range of numbers, the results do appear totally random. However, this is a pretty small sample; I'd suppose you'd need to generate thousands of numbers this way before you knew for sure.

First block is my code, second block is the output:

Code:
``````#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    int i;

    for (i = 0; i < 30; i++) {
        sleep(1);              /* wait so time(NULL) changes between iterations */
        srand(time(NULL));     /* re-seeding every iteration: the pattern under test */
        int n = rand() % 100;
        printf("%d\n", n);
    }

    return 0;
}``````

Code:
``````32
18
7
15
72
34
88
7
65
34
80
82
61
67
36
5
78
31
82
37
16
77
79
9
25
90
21
80
36
27``````

Numbers in the 30's happened to appear several times in this, but that could easily just be chance.

Solution
Based on your output, around 33% of the numbers fall between 20 and 40. That span covers 20 of the 100 possible values, i.e. 20%, so you'd expect roughly that share of the output to land there.

But there is no reason to read more into it, since the total output is only 30 rounds. To get a reasonable sample of random numbers, you should generate thousands of them.

Now, I don't have a C compiler on this computer, so I'll use JS instead.
JavaScript:
``````/* With this code, each number between 1 and 100
   should be randomized around 100 times. */

let count = {};

for (let i = 0; i < 10000; i++) {
    let num = Math.floor(Math.random() * 100) + 1;
    if (!count[num]) {
        count[num] = 1;
    } else {
        count[num]++;
    }
}...``````

#### null_reflections

##### Legendary Coder
Does it? IMO all this experiment reveals is that, at least in C# on Windows on a 64-bit Intel architecture, the random function is not biased towards any values. Or if it is, it is not perceptible to the naked eye. But OK, I did not use the `% 100`, but rather asked the function to generate numbers in the specified range (the size of my canvas). Perhaps it matters, although I somehow doubt it. I would try the same experiment in C but I don't do any graphics programming in C.
I think it would have to be based on values and it would be interesting to know more.

#### cbreemer

##### King Coder
Yes it would. You'll need to get hold of the `rand()` source code for your particular compiler to learn the gory details. There seem to be quite a number of (slightly) different/improved versions. What seems to be the historical implementation looks laughably simple (which is not to say I understand it 😁 ):
C:
``return ((*ctx = *ctx * 1103515245 + 12345) % ((u_long)RAND_MAX + 1));``
The next random number is derived from the previous one (or the seed, if we're at the beginning) by this simple formula. It's probably true what they say that this is not a great random generator. Other implementations look more complex, some even using processor features.

I found this quote somewhere which I suspect is right on the ball:
> If you really need to understand and implement a random function you should probably read Knuth's Seminumerical Algorithms first. It will set you straight on how these things really work.

#### null_reflections

##### Legendary Coder
> Yes it would. You'll need to get hold of the `rand()` source code for your particular compiler to learn the gory details. There seem to be quite a number of (slightly) different/improved versions. What seems to be the historical implementation looks laughably simple (which is not to say I understand it 😁 ):
> C:
> ``return ((*ctx = *ctx * 1103515245 + 12345) % ((u_long)RAND_MAX + 1));``
> The next random number is derived from the previous one (or the seed, if we're at the beginning) by this simple formula. It's probably true what they say that this is not a great random generator. Other implementations look more complex, some even using processor features.
>
> I found this quote somewhere which I suspect is right on the ball:
I don't know, maybe we are over-complicating the issue. I've noticed through doing this as well, that to a human, you always start counting at 1. If you are a computer, you always start counting at 0. To a human...0 basically means "non-existent", whereas a robot/computer couldn't possibly conceive of human zero outside of a mathematical equation, so there is maybe a counting bias going on with this. For example...you can divide 10 by 2 to get 5, but then whatever programmer is creating the rand logic, maybe they have to bias towards 0 for some reason. I don't know, I'm honestly still confused. Maybe the whole subject of this discussion, rand being biased towards low numbers, is somehow based on human error.

However, I did turn my earlier idea into another RNG, this is based on a quadrillion possibilities, but it only outputs 1 through 10:
Code:
``````#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    srand(time(NULL));
    long double n = rand() % 999999999999999;

    if (n >= 0 && 99999 >= n) {
        puts("1");
    }
    else if (n >= 100000 && 999999 >= n) {
        puts("2");
    }
    else if (n >= 1000000 && 9999999 >= n) {
        puts("3");
    }
    else if (n >= 10000000 && 99999999 >= n) {
        puts("4");
    }
    else if (n >= 100000000 && 999999999 >= n) {
        puts("5");
    }
    else if (n >= 1000000000 && 9999999999 >= n) {
        puts("6");
    }
    else if (n >= 10000000000 && 99999999999 >= n) {
        puts("7");
    }
    else if (n >= 100000000000 && 999999999999 >= n) {
        puts("8");
    }
    else if (n >= 1000000000000 && 99999999999999 >= n) {
        puts("9");
    }
    else {
        puts("10");
    }
}``````

I could make another loop, but I feel that if i keep thinking of this for now, then i'm going to short circuit my brain...

#### cbreemer

##### King Coder
Hell no, it's nothing to do with counting from 0 or 1. Computers and programmers alike count from zero and have always done so.
I only used 1 in my earlier C example because `1..100` somehow looks better than `0..99`.
What your above code is all about I have no clue really.... Does your `rand()` return double values that big? Did you check the value of RAND_MAX?

If I were you I'd leave it... but that's just me being lazy.
If you really want to get to the bottom of this, look up the source code. If anything is unclear in the code, go read Knuth's book. It may take a while, but you'll be an expert in the end 😁

#### null_reflections

##### Legendary Coder
> Hell no, it's nothing to do with counting from 0 or 1. Computers and programmers alike count from zero and have always done so.
> I only used 1 in my earlier C example because `1..100` somehow looks better than `0..99`.
> What your above code is all about I have no clue really.... Does your `rand()` return double values that big? Did you check the value of RAND_MAX?
>
> If I were you I'd leave it... but that's just me being lazy.
> If you really want to get to the bottom of this, look up the source code. If anything is unclear in the code, go read Knuth's book. It may take a while, but you'll be an expert in the end 😁
*Sigh*, nobody needs to read books in order to understand things, even though books are pretty fun to read and I might buy it.

The code above is choosing from a pool of a quadrillion numbers with the long double data type...from 0 to...

*takes deep breath*

nine-hundred and ninety-nine trillion, nine-hundred and ninety-nine billion, nine-hundred and ninety-nine million, nine-hundred and ninety-nine thousand, nine-hundred and ninety-nine. THEN, it assigns 1-10 based on which tenth of that range the number falls in. It was fun, but dealing with gigantic numbers like that...especially when you can't use commas...is annoying.

No, I haven't learned how to check the value of RAND_MAX yet; Google says this about it:

Code:
``The constant RAND_MAX is the maximum value that can be returned by the rand function. RAND_MAX is defined as the value 0x7fff``

#### null_reflections

##### Legendary Coder
Okay, so mapping to 10 numbers by splitting a quadrillion possibilities into ranges probably did work; I can already see that it's not biased towards low numbers anymore. Doing more testing would be tedious, so I would like to actually try to figure out what the other suggestions do now...

Code:
``````sed -n '/[1-5]/p' test1 | wc -l
46

sed -n '/[1-5]/p' test2 | wc -l
53

sed -n '/[1-5]/p' test3 | wc -l
44``````

However, it seems to be overwhelmingly choosing numbers between 4 and 6. What's up with that? Just a statistical anomaly? Yeah, I don't think rand seeded with time is a great RNG...

Code:
``````#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    int x;

    for (x = 0; x < 100; x++) {
        srand(time(NULL));
        long double n = rand() % 999999999999999;

        sleep(1);

        if (n >= 0 && 99999 >= n) {
            puts("1");
        }
        else if (n >= 100000 && 999999 >= n) {
            puts("2");
        }
        else if (n >= 1000000 && 9999999 >= n) {
            puts("3");
        }
        else if (n >= 10000000 && 99999999 >= n) {
            puts("4");
        }
        else if (n >= 100000000 && 999999999 >= n) {
            puts("5");
        }
        else if (n >= 1000000000 && 9999999999 >= n) {
            puts("6");
        }
        else if (n >= 10000000000 && 99999999999 >= n) {
            puts("7");
        }
        else if (n >= 100000000000 && 999999999999 >= n) {
            puts("8");
        }
        else if (n >= 1000000000000 && 99999999999999 >= n) {
            puts("9");
        }
        else {
            puts("10");
        }
    }
}``````

#### null_reflections

##### Legendary Coder
> Yes it would. You'll need to get hold of the `rand()` source code for your particular compiler to learn the gory details. There seem to be quite a number of (slightly) different/improved versions. What seems to be the historical implementation looks laughably simple (which is not to say I understand it 😁 ):
> C:
> ``return ((*ctx = *ctx * 1103515245 + 12345) % ((u_long)RAND_MAX + 1));``
> The next random number is derived from the previous one (or the seed, if we're at the beginning) by this simple formula. It's probably true what they say that this is not a great random generator. Other implementations look more complex, some even using processor features.
>
> I found this quote somewhere which I suspect is right on the ball:
Donald Knuth's series of books is kinda old, from 1968, and the book you referenced is no. 2 in a series of 7 volumes. Seeing this obviously honest/knowledgeable review, I think there are probably much better options out there, even if they are hard to find:

Code:
``````The definitive work on programming; without a doubt there is no more important book on Computer
Science. However, it's almost totally impenetrable. I haven't read even a quarter of this,
and fully understood much less, but that's nothing to be ashamed of,
as probably no one else has either. All the examples are in a made up assembly language,
and Knuth invented his own typesetting system to publish it, which became widespread and famous.``````

#### cbreemer

##### King Coder
Sure, you can do lots without reading a book these days. Understanding an algorithm and its intricacies is another matter though. I'd guess the algorithm of `rand()` may still be based on Knuth's ideas from 1968. They may be old now, but they weren't in 1969 when UNIX development started.

Checking `RAND_MAX` is as easy as this

C:
``printf("RAND_MAX = %lu\n", RAND_MAX);``

and you should definitely do that, then think about whether it really makes sense to compute the result `% 999999999999999`. Also, check if `rand()` really returns a `double` as you seem to assume.

#### null_reflections

##### Legendary Coder
> Checking `RAND_MAX` is as easy as this

Still not very easy. I think knowing what `%lu` means as a format/data type would be just as important as knowing the numeric value of the constant.

And my code that I posted before is bad; I even labeled it as such. Yet I will surely be on to something when I know WHY it's bad.

#### cbreemer

##### King Coder
I was just assuming `rand()` might return a long integer, which is why I used `%lu`. But I don't know your environment. Have a look in `<stdlib.h>` to see the return type of `rand()` and the value of `RAND_MAX`. It's about time to get some facts rather than keep guessing.

#### null_reflections

##### Legendary Coder
> But I don't know your environment.

It's a very standard Linux environment, as I insinuated earlier in the thread.

#### cbreemer

##### King Coder
Compiler brand, version, and bitness, and perhaps the processor architecture, are the things that matter, not the Linux environment.