is "rand() % 100;" not very random?

null_reflections

Legendary Coder
I'm just wondering whether what the folks on this stack say is totally true or maybe outdated, and if it is true, what they mean:


The thread is over 13 years old, but when I test that particular way to establish a range of numbers, the results do appear totally random. However, this is a pretty small sample; I'd suppose you'd need to generate thousands of numbers this way before you knew for sure.

First block is my code, second block is the output:

Code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main (){

int i;

for ( i=0; i<30; i++ ) {

        sleep(1);
        srand(time(NULL));
        int n = rand() % 100;
        printf ("%d\n", n);

}

}

Code:
32
18
7
15
72
34
88
7
65
34
80
82
61
67
36
5
78
31
82
37
16
77
79
9
25
90
21
80
36
27

Numbers in the 30's happened to appear several times in this, but that could easily just be chance.
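
A side note on the test program above: because it calls srand(time(NULL)) on every pass, each printed value is the first rand() result for a fresh seed, and the sleep(1) is what keeps consecutive seeds from being identical. The sketch below shows that the same seed always gives the same values. The helper names first_after_seed and draw_seeded_once are made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: seed the generator, then return the first value it produces. */
int first_after_seed(unsigned int seed) {
    srand(seed);
    return rand();
}

/* Hypothetical helper: seed once, then fill out[] with `count` values in 0..99
   taken from one continuous sequence. */
void draw_seeded_once(unsigned int seed, int *out, int count) {
    srand(seed);
    for (int i = 0; i < count; i++) {
        out[i] = rand() % 100;
    }
}
```

In a program like the one above, the more usual pattern would be to call srand(time(NULL)) once before the for loop, after which the sleep(1) is no longer needed.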
 
Solution
Based on your output, around 33% of the numbers (10 of 30) fall between 20 and 40.
That range covers only about 20% of the hundred possible values, so you would expect roughly 20% of the outputs to land in it.

But there is no reason to study this further, since the output comes from only 30 rounds. To get a reasonable sample of random numbers, you should generate thousands of them.

Now, I don't have a C compiler on this computer, so I'll use JS instead.
JavaScript:
/*  With this code, each number between 1 and 100
    should be randomized around 100 times. */
  
let count = {};

for (let i = 0; i < 10000; i++) {
  let num = Math.floor(Math.random() * 100) + 1;
  if (!count[num]) {
    count[num] = 1;
  } else {
    count[num]++;
  }
}...
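
For anyone who does have a C compiler handy, a tally like the one above could look roughly like this in C. This is only a sketch: the function names are my own, it seeds once per run, and it counts 0..99 (what rand() % 100 produces) rather than 1..100.

```c
#include <stdlib.h>

#define BUCKETS 100

/* Seed once, draw `draws` values with rand() % BUCKETS,
   and count how often each result appears. */
void tally_rand_mod(unsigned int seed, long draws, long counts[BUCKETS]) {
    for (int b = 0; b < BUCKETS; b++) {
        counts[b] = 0;
    }
    srand(seed);
    for (long i = 0; i < draws; i++) {
        counts[rand() % BUCKETS]++;
    }
}

/* Sum of all counts; should equal the number of draws. */
long tally_total(const long counts[BUCKETS]) {
    long total = 0;
    for (int b = 0; b < BUCKETS; b++) {
        total += counts[b];
    }
    return total;
}
```

With 10000 draws each of the 100 counts should come out somewhere near 100, and printing the counts array makes any strong bias visible.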
Does it? IMO all this experiment reveals is that, at least in C# on Windows with a 64-bit Intel architecture, the random function is not biased towards any values. Or if it is, it is not perceptible with the naked eye. But OK, I did not use the % 100, but rather asked the function to generate numbers in the specified range (the size of my canvas). Perhaps it matters, although I somehow doubt it. I would try the same experiment in C but I don't do any graphics programming in C.
I think it would have to be based on values and it would be interesting to know more.
 
Yes it would. You'll need to get hold of the rand() source code for your particular compiler to learn the gory details. There seem to be quite a number of (slightly) different/improved versions. What seems to be the historical implementation looks laughably simple (which is not to say I understand it 😁 ):
C:
return ((*ctx = *ctx * 1103515245 + 12345) % ((u_long)RAND_MAX + 1));
The next random number is derived from the previous one (or the seed, if we're at the beginning) by this simple formula. It's probably true what they say that this is not a great random generator. Other implementations look more complex, some even using processor features.
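
To make the formula a bit more concrete, here is a minimal standalone sketch of it. The names sketch_rand and SKETCH_RAND_MAX, the 64-bit state, and the value 0x7fffffff are assumptions for illustration only; a real library's rand() may differ in all of these.

```c
#include <stdint.h>

/* Assumed for this sketch only; the real RAND_MAX depends on the C library. */
#define SKETCH_RAND_MAX 0x7fffffffu

/* One step of the quoted formula:
   state  = state * 1103515245 + 12345   (wraps modulo 2^64 here)
   result = state % (SKETCH_RAND_MAX + 1) */
uint32_t sketch_rand(uint64_t *ctx) {
    *ctx = *ctx * 1103515245u + 12345u;
    return (uint32_t)(*ctx % ((uint64_t)SKETCH_RAND_MAX + 1));
}
```

In this sketch the multiplier and the increment are both odd and the modulus is a power of two, so the lowest bit of successive results simply alternates between 0 and 1. That kind of pattern in the low bits is probably what people have in mind when they warn against combining a generator like this with `%`.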

I found this quote somewhere which I suspect is right on the ball:
If you really need to understand and implement a random function you should probably read Knuth’s Seminumerical Algorithms first. It will set you straight on how these things really work.
 
I don't know, maybe we are over-complicating the issue. I've noticed through doing this as well, that to a human, you always start counting at 1. If you are a computer, you always start counting at 0. To a human...0 basically means "non-existent", whereas a robot/computer couldn't possibly conceive of human zero outside of a mathematical equation, so there is maybe a counting bias going on with this. For example...you can divide 10 by 2 to get 5, but then whatever programmer is creating the rand logic, maybe they have to bias towards 0 for some reason. I don't know, I'm honestly still confused. Maybe the whole subject of this discussion, rand being biased towards low numbers, is somehow based on human error.

However, I did turn my earlier idea into another RNG. This one is based on a quadrillion possibilities, but it only outputs 1 through 10:
Code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main() {
                                                               
srand(time(NULL));                
long double n = rand() % 999999999999999;


if (n >= 0 && 99999 >= n) {
        puts("1");
}
else if ( n >= 100000 && 999999 >= n) {
        puts("2");
}
else if ( n >= 1000000 && 9999999 >= n) {
        puts("3");
}
else if ( n >= 10000000 && 99999999 >= n) {
        puts("4");
}
else if ( n >= 100000000 && 999999999 >= n) {
        puts("5");
}
else if ( n >= 1000000000 && 9999999999 >= n) {
        puts("6");
}
else if ( n >= 10000000000 && 99999999999 >= n) {
        puts("7");
}
else if ( n >= 100000000000 && 999999999999 >= n) {
        puts("8");
}
else if ( n >= 1000000000000 && 99999999999999 >= n) {
        puts("9");
}
else {
        puts("10");
}

}

I could make another loop, but I feel that if I keep thinking about this for now, I'm going to short-circuit my brain...
 
Hell no, it has nothing to do with counting from 0 or 1. Computers and programmers alike count from zero and always have.
I only used 1 in my earlier C example because 1..100 somehow looks better than 0..99.
What your above code is all about I have no clue really... Does your rand() return double values that big? Did you check the value of RAND_MAX?

If I were you I'd leave it... but that's just me being lazy.
If you really want to get to the bottom of this, look up the source code. If anything is unclear in the code, go read Knuth's book. It may take a while, but you'll be an expert in the end 😁
 
*Sigh*, nobody needs to read books in order to understand things, even though books are pretty fun to read and I might buy it :)

The code above is choosing from a pool of a quadrillion numbers with the long double data type...from 0 to...

*takes deep breath*

nine-hundred and ninety-nine trillion, nine-hundred and ninety-nine billion, nine-hundred and ninety-nine million, nine-hundred and ninety-nine thousand, nine-hundred and ninety-nine. THEN, it assigns to 1-10 based on the ten-divisible range of that number. It was fun, but dealing with gigantic numbers like that...especially when you can't use commas...is annoying.

No, I haven't learned how to check the value of RAND_MAX yet. Google says this about it:

Code:
The constant RAND_MAX is the maximum value that can be returned by the rand function. RAND_MAX is defined as the value 0x7fff
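
Taking that quoted value at face value (RAND_MAX = 0x7fff, so 32768 possible raw results; other C libraries use larger values), a bit of counting shows how uneven rand() % 100 would be even with a perfectly uniform generator. The helper raw_values_for is hypothetical and assumes 0 <= residue < modulus.

```c
/* Count how many raw values v in 0..rand_max satisfy v % modulus == residue. */
long raw_values_for(long rand_max, long modulus, long residue) {
    long total = rand_max + 1;       /* number of possible raw values */
    long count = total / modulus;    /* complete cycles of 0..modulus-1 */
    if (residue < total % modulus) {
        count++;                     /* the partial last cycle only reaches the low residues */
    }
    return count;
}
```

With these numbers, each result from 0 to 67 has 328 raw values behind it and each result from 68 to 99 has 327, a relative difference of about 0.3%. So the low results are slightly favored, but by far too little to notice in 30 samples.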
 
Okay, so mapping 10 numbers onto a quadrillion possibilities divided by 10 probably did work; I can already see that it's not biased towards low numbers anymore. Doing more testing would be tedious, and I would like to actually try to figure out what the other suggestions do now...

Code:
sed -n '/[1-5]/p' test1 | wc -l
46

sed -n '/[1-5]/p' test2 | wc -l
53

sed -n '/[1-5]/p' test3 | wc -l
44

However, it seems to be overwhelmingly choosing numbers between 4 and 6. What's up with that? Just a statistical anomaly? Yeah, I don't think rand seeded with time is a great RNG...

Code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main() {

int x;

for(x = 0; x < 100; x++){

                srand(time(NULL));
                long double n = rand() % 999999999999999;

                sleep(1);

                if (n >= 0 && 99999 >= n) {
                                puts("1");
                }

                else if ( n >= 100000 && 999999 >= n) {
                        puts("2");
                }

                else if ( n >= 1000000 && 9999999 >= n) {
                        puts("3");
                }
                else if ( n >= 10000000 && 99999999 >= n) {
                        puts("4");
                }
                else if ( n >= 100000000 && 999999999 >= n) {
                        puts("5");
                }
                else if ( n >= 1000000000 && 9999999999 >= n) {
                        puts("6");
                }
                else if ( n >= 10000000000 && 99999999999 >= n) {
                        puts("7");
                }
                else if ( n >= 100000000000 && 999999999999 >= n) {
                        puts("8");
                }
                else if ( n >= 1000000000000 && 99999999999999 >= n) {
                        puts("9");
                }
                else {
                        puts("10");
                }

        }

}
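
One possible explanation for the clustering: rand() never returns more than RAND_MAX, so the % 999999999999999 changes nothing whenever RAND_MAX is smaller than that, and the ranges in the if chain are not equally wide (most are ten times wider than the one before). Assuming RAND_MAX is 2147483647, which is only a guess about the platform used here, the share of raw values landing in each range can be estimated as below. The helper bucket_share is made up for this sketch.

```c
/* Fraction of the raw values 0..rand_max that fall inside lo..hi (inclusive). */
double bucket_share(long long lo, long long hi, long long rand_max) {
    if (hi > rand_max) {
        hi = rand_max;               /* rand() never exceeds rand_max */
    }
    if (lo > hi) {
        return 0.0;                  /* the whole range is out of reach */
    }
    return (double)(hi - lo + 1) / (double)(rand_max + 1);
}
```

Under that assumption about 4% of the results print 4, about 42% print 5, and about 53% print 6, while 7 through 10 can never appear. That would match output that lands almost entirely between 4 and 6.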
 
Donald Knuth's series of books is kinda old, from 1968, and that book you referenced is no. 2 in a series of 7 volumes. Seeing this obviously honest/knowledgeable review, I think there are probably much better options out there, even if they are hard to find:

Code:
The definitive work on programming; without a doubt there is no more important book on Computer
Science. However, it's almost totally impenetrable. I haven't read even a quarter of this,
and fully understood much less, but that's nothing to be ashamed of,
as probably no one else has either. All the examples are in a made up assembly language,
and Knuth invented his own typesetting system to publish it, which became widespread and famous.
 
Sure, you can do lots without reading a book these days. Understanding an algorithm and its intricacies is another matter though. I'd guess the algorithm of rand() may still be based on Knuth's ideas from 1968. They may be old now, but they weren't in 1969, when UNIX development started.

Checking RAND_MAX is as easy as this

C:
printf("RAND_MAX = %lu\n", (unsigned long)RAND_MAX);

and you should definitely do that, then think about whether it really makes sense to compute the result % 999999999999999. Also, check if rand() really returns a double as you seem to assume.
 
I was just assuming rand() might return a long integer, which is why I used %lu. But I don't know your environment. Have a look in <stdlib.h> to see the return type of rand() and the value of RAND_MAX. It's about time to get some facts rather than keep guessing.
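
For reference, standard C declares rand() as returning int and only guarantees that RAND_MAX is at least 32767. A small sketch of the check suggested above follows. The helper names are made up, and the cast to int relies on RAND_MAX fitting in an int, which it must since rand() returns one.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if rand() % modulus can ever differ from rand(),
   which is only the case when modulus <= RAND_MAX. */
int modulus_has_effect(long long modulus) {
    return modulus <= (long long)RAND_MAX;
}

/* Print RAND_MAX using a format that matches the argument's type. */
void print_rand_max(void) {
    printf("RAND_MAX = %d\n", (int)RAND_MAX);
}
```

On any platform where int is 32 bits or smaller, modulus_has_effect(999999999999999LL) returns 0, meaning that modulus leaves the value of rand() unchanged.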
 
