
is "rand() % 100;" not very random?

null_reflections

Legendary Coder
I'm just wondering if what the folks on this Stack Exchange thread say is totally true, or maybe outdated, and if it is true, what they mean:


The thread is over 13 years old, but when I test that particular way of establishing a range of numbers, the results do appear totally random. However, this is a pretty small sample; I'd suppose you'd need thousands of numbers generated this way before you knew for sure.

First block is my code, second block is the output:

Code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    int i;

    for (i = 0; i < 30; i++) {
        sleep(1);             /* wait a second so time(NULL) changes */
        srand(time(NULL));    /* re-seed on every iteration */
        int n = rand() % 100;
        printf("%d\n", n);
    }

    return 0;
}

Code:
32
18
7
15
72
34
88
7
65
34
80
82
61
67
36
5
78
31
82
37
16
77
79
9
25
90
21
80
36
27

Numbers in the 30s happened to appear several times in this run, but that could easily just be chance.
 
Solution
Based on your output, around 33% of the numbers fall between 20 and 40. Those 20 values are 20% of the hundred possible ones, so over the long run they should account for about 20% of the output as well.

But there is no reason to read more into it, since the total output is only 30 rounds. To get a reasonable amount of random numbers, you should generate thousands of them.

Now, I don't have a C compiler on this computer, so I'll use JS instead.
JavaScript:
/*  With this code, each number between 1 and 100
    should be randomized around 100 times. */

let count = {};

for (let i = 0; i < 10000; i++) {
  let num = Math.floor(Math.random() * 100) + 1;
  if (!count[num]) {
    count[num] = 1;
  } else {
    count[num]++;
  }
}

// print how many times each number occurred
for (let num = 1; num <= 100; num++) {
  console.log(`${num}: ${count[num] || 0}`);
}
I just cannot imagine that the C language, one of the most popular and powerful languages, has been stuck for 50 years with a random number generator so flawed as to "favor lower numbers". That would have been fixed long ago, I would think! But suppose it's true; then I'm sure there would have been a good reason for it, and it seems unlikely that you could simply remedy it by biasing your choice to slightly favor higher numbers.
Being inquisitive is good! As is a frank discussion 😁 But a 30-number sequence is just not nearly sufficient for any conclusions. Why don't you let it run indefinitely (until aborted) and, instead of printing the numbers (which are not all that interesting to look at), print out the running average, which is expected to converge to the midpoint of the range (49.5 for 0–99)? At least that might prove or disprove the so-called favoring of lower numbers.
 
When it comes to computers, there are no truly random numbers. The output is tied to the exact moment the number is produced, or to a seed passed to the function that produces it.

On the other hand, throwing a die does not produce a truly random number either, but a number that comes up on top due to physical laws.



You must have misunderstood. We did not say that; we said it may or may not produce them. It may also produce more high numbers than low numbers, or numbers in the middle. But some kind of uniform pattern is most likely to emerge over time.

An output of 30 random numbers is not valid data by any means for studying how the rand function works. It takes thousands of rounds, or even more, to really see the output and be able to validate the data.
Ok, sorry, I truly regret posting any of that.
 
I just cannot imagine that the C language, one of the most popular and powerful languages, has been stuck for 50 years with a random number generator so flawed as to "favor lower numbers". [...]
Oh, I was thinking maybe I could have two random number generators, where one just generates values for an if statement, which then skews the value a certain way if one of the numbers for the if statement is chosen. Yesterday was a really bad day for some reason, but I still think rand is interesting. I think it always prints the same thing if you don't seed with time.
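That last part is right: without a call to srand(), rand() behaves as if it had been seeded with srand(1), so every run produces the identical sequence. A minimal sketch of the usual pattern, seeding once before the loop instead of on every iteration:

C:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    srand((unsigned)time(NULL));   /* seed once, at startup */

    for (int i = 0; i < 30; i++) {
        int n = rand() % 100;      /* no sleep(1) needed between calls */
        printf("%d\n", n);
    }

    return 0;
}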
 
I'm curious to hear what you believe you would achieve with such a solution? After all, no matter how many different random number functions you create and call, the output follows certain rules, and patterns will exist regardless. If rand returns 1 in the first round and 2 in the second, the numbers are just as random as if it had returned 1 and 99.

If you want some spread, adjust the code so that there is some minimum difference between the first and second random numbers, and loop until the difference is big enough for you.

Since I'm still without a C compiler, here is the idea in JS.

JavaScript:
// store the latest random number in a variable
let prevRandom = -1;  // negative as a start value

function getRandomNumber() {
  let random = Math.floor(Math.random() * 101);

  // accept if there is no previous number yet, or if the
  // difference from the previous number is at least 25
  if (prevRandom === -1 || Math.abs(random - prevRandom) >= 25) {
    prevRandom = random;
    return random;
  } else {
    // recurse until the difference is big enough;
    // a while loop would work just as well
    return getRandomNumber();
  }
}
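For readers following along in C, a rough translation of the same idea might look like this. It's just a sketch: the helper name spread_rand and the threshold 25 are made up for illustration, and a loop replaces the recursion:

C:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static int prevRandom = -1;   /* negative means "no previous number yet" */

/* keep drawing until the new number differs from the previous one by at least 25 */
int spread_rand(void) {
    int r;
    do {
        r = rand() % 101;     /* 0..100, like the JS version */
    } while (prevRandom != -1 && abs(r - prevRandom) < 25);
    prevRandom = r;
    return r;
}

int main(void) {
    srand((unsigned)time(NULL));
    for (int i = 0; i < 10; i++)
        printf("%d\n", spread_rand());
    return 0;
}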
 
I'm curious to hear what you believe you would achieve with such a solution?
Oh, well, basically just a reliable pseudo-random number generator that has small values. The small values of course aren't practical for making something like that, but it's just easier for a beginner to grasp, and also just "a cool idea" to me. Grappling with that Stack Exchange question was kinda frustrating and hard, but I think it can easily be done with my algorithm/idea. BUT FIRST, I need a larger sample to round off an estimate of how much it biases the numbers (a couple of times).
 
If you want some spread, adjust the code so that there is some minimum difference between the first and second random numbers, and loop until the difference is big enough for you.
Hmmm, not sure I agree. Random is random, and the very essence of it is that it's unpredictable and should follow no rules. You have to accept that sometimes you get adjacent numbers or even the same number twice. Any attempt to "remedy" this invalidates the concept.
Anyway, as you say, I think we can trust rand() to be correct and unbiased after half a century of field service 😁
 
Well, if you want random numbers that have some spread, my example is valid. We don't know the needs of the particular program someone is working on.
Anyway, I think this one is done, so I don't care to talk more about this issue. It's getting to be more of a philosophical issue than a programming one.
 
rand is reliable. You don't need to worry about such an issue. There are patterns everywhere, whether we want them or not.
I actually left my computer on for a while and collected 25,000 lines in a document... it's funny, 1,000,000 seconds is actually a long time in terms of a human life if you watch the time fly. Kinda makes you think... I don't like leaving my computer on for huge periods of time.

Anyway, that is correct if you use a large sample like that (the first one): the range was just 0 to 2, and 0 and 1 together actually showed up slightly less than 2/3 of the time, which means 2 was chosen slightly more often... this indicates that the machine is actually random enough, because when I was using smaller samples, I was more often getting the bottom half of the range, like I showed earlier.

However, if you create the seed and call rand again to get that second/different dice roll, then your pseudo-random becomes more random. I don't know if I'm going to do it or not, probably another time when I've gotten some sleep.
 
I'm curious to hear what you believe you would achieve with such a solution? [...]
Here's what I talked about (using the double if statements) in C. It seems to still be biasing towards lower numbers, so I would need to add a couple more conditions:
Code:
#include <stdio.h>   /* needed for printf */
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    int i;

    for (i = 0; i < 1000; i++) {
        int n;
        int l;
        int j;

        j = rand() % 2;   /* j is always 0 or 1 */

        sleep(1);

        n = rand() % 9;   /* n is always 0..8 */

        if (j == 0 || j == 1) {        /* always true, so the branches below never run */
            srand(time(NULL));
            printf("%d\n", n);
        }
        else if (j == 2 && n != 9) {   /* unreachable: j is never 2 */
            srand(time(NULL));
            l = n + 1;
            printf("%d\n", l);
        }
        else {
            srand(time(NULL));
            printf("%d\n", n);
        }
    }

    return 0;
}
 
I don't feel good about the re-seeding. It's like you want the computer to give you random numbers, but only in a way that suits you. Which is not quite the idea of random, IMO.
Anyway, for fun I programmed a loop of 5 million random numbers in the range 1-100 inclusive, using a single seed, and printed the running average:

C:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    unsigned long sum = 0;
    srand((unsigned)time(NULL));

    for (int n = 1; n <= 5000000; n++)
    {
        int num = 1 + rand() % 100;
        sum += num;
        double avg = (double)sum / (double)n;
        printf("%20d  %3d  %f\n", n, num, avg);   /* count, number, running average */
    }
    return 0;
}

It's fun to watch the running average. While it quickly nears the expected 50.5, it never quite wants to get there; instead it very slowly seems to oscillate around it, eventually deciding to sit very slightly below it:

Code:
             ...
             4999980   86  50.449114
             4999981   75  50.449119
             4999982   91  50.449127
             4999983   16  50.449120
             4999984   49  50.449119
             4999985   32  50.449116
             4999986   10  50.449108
             4999987   23  50.449102
             4999988   42  50.449100
             4999989   86  50.449108
             4999990   62  50.449110
             4999991   96  50.449119
             4999992   88  50.449127
             4999993   57  50.449128
             4999994   11  50.449120
             4999995   63  50.449122
             4999996   10  50.449114
             4999997   58  50.449116
             4999998   68  50.449119
             4999999   38  50.449117
             5000000   63  50.449119

So perhaps there is some truth after all in people saying that rand() % 100 favors lower numbers. But if it does, it's only very slight and nothing to worry about, IMO. I should try a similar test without the modulo 100 to see if that makes a difference.

Yes @EkBass, this tends to get a bit philosophical. It's the nature of the beast, I think, as there isn't a real concise definition of random.
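For what it's worth, rand() % 100 does carry a small, well-understood remainder bias: if RAND_MAX is 32767 (the minimum the C standard allows, and a common value), there are 32768 possible results, and 32768 = 327 × 100 + 68, so each of the remainders 0–67 can be produced 328 ways while 68–99 can only be produced 327 ways. A sketch of the standard rejection trick that removes it entirely, assuming only the C standard library:

C:
#include <stdlib.h>

/* Return an unbiased value in [0, n): throw away the top sliver of
   rand()'s range that doesn't divide evenly into n-sized buckets. */
int rand_below(int n)
{
    int cutoff = RAND_MAX - (RAND_MAX % n);   /* largest multiple of n <= RAND_MAX */
    int r;
    do {
        r = rand();
    } while (r >= cutoff);
    return r % n;
}

The loop rejects only a tiny fraction of draws (68 out of 32768 when RAND_MAX is 32767), so in practice it almost never iterates more than once.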
 
I don't feel good about the re-seeding. It's like you want the computer to give you random numbers, but only in a way that suits you.
The purpose of having two rand()s is to skew the bias away from it more frequently choosing the lower half of the numbers. Every time someone writes code, they are doing it in a way that suits them or their clients/employer, correct? I think your code is interesting and I'll scrutinize it more closely at one point or another. My next task is to see what my output looks like when the first rand has more options; I've structured it kinda like gears on a bike.
 
Well, I effectively skewed the results more to the upper half of the numbers, but now it just does the same thing in reverse:

Code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    int i;

    for (i = 0; i < 1000; i++) {
        int n;
        int l;
        int j;

        j = rand() % 5;   /* j is 0..4, so j == 5 below can never happen */

        sleep(1);

        n = rand() % 9;   /* n is 0..8, so n != 9 below is always true */

        if (j == 4 || j == 5) {
            srand(time(NULL));
            printf("%d\n", n);
        }
        /* note: && binds tighter than ||, so this condition reads
           j==0 || j==1 || j==2 || (j==3 && n!=9) */
        else if (j == 0 || j == 1 || j == 2 || (j == 3 && n != 9)) {
            srand(time(NULL));
            l = n + 1;
            printf("%d\n", l);
        }
        else {
            srand(time(NULL));
            printf("%d\n", n);
        }
    }

    return 0;
}

I could keep messing with it, but that's kinda boring to me at this point. I'd basically just have to keep adjusting the ratio, and at that point I might as well just use larger numbers. You could also assign the smaller numbers to specific ranges within a more astronomical number, but I'm just not interested currently. There's a lot to learn in C.
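One common way to get that "more astronomical number", loosely along these lines, is to glue two rand() results together, since each call is only guaranteed to provide 15 bits. A sketch, assuming only that RAND_MAX is at least 32767 as the standard requires:

C:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    srand((unsigned)time(NULL));

    /* combine two 15-bit draws into one 30-bit value (0 .. 2^30 - 1) */
    long big = ((long)(rand() & 0x7FFF) << 15) | (rand() & 0x7FFF);
    printf("%ld\n", big);

    return 0;
}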
 
Well, I thought to do some research too, with much bigger data.

JavaScript:
/* randomizes a number between 1 and 100 and counts the number of times each number is generated */
function randomArray() {
  let count = Array.from({length: 100}, () => 0);

  // let's do a few million rounds
  for (let i = 0; i < 5000000; i++) {
    let randomIndex = Math.floor(Math.random() * 100);
    count[randomIndex]++;
  }

  console.log("Array index counts:");
  for (let i = 0; i < 100; i++) {
    console.log(`${i + 1}: ${count[i]}`);
  }
}

randomArray();

I don't want to paste several outputs here, but in general, at least on my computer with JavaScript run under Node, it was pretty even. I counted how many results were 50 or under and how many were over 50.

There was no bias. Sometimes the numbers over 50 had a few more occurrences, and sometimes those 50 and under had more. Mostly the average was somewhere between 48 and 52. That is what you should expect with this number of rounds. Naturally, with fewer rounds, something like 30 or maybe a few hundred, the average can vary much more.

I am not able to say what a good number of rounds is for getting the average. Naturally the first few thousand rounds have much more effect on the average than rounds somewhere close to a million; the error of a running average shrinks roughly like 1/√n, so if only low numbers are drawn at the start, it takes quite a while for later rounds to pull the average back up.

I would not say that some function in some language is biased, since the result always depends on the time the function is called or the seed it has been provided. Today smaller numbers may occur, and some other day higher ones. As said, there are always patterns, and if for some reason there are not, human brains may still find them if they want to.
 
There was no bias. Sometimes the numbers over 50 had a few more occurrences, and sometimes those 50 and under had more. [...]
I haven't tested your code, but that is the best way to put a random number generator to the test.
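For a less eyeball-driven check than watching averages, one standard tool is a chi-squared test over the bucket counts. A sketch in C, under the usual assumptions (100 buckets gives 99 degrees of freedom, and a statistic far above roughly 123, the 5% critical value, would hint at non-uniformity):

C:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    enum { BUCKETS = 100 };
    const long ROUNDS = 5000000;
    long count[BUCKETS] = {0};

    srand((unsigned)time(NULL));
    for (long i = 0; i < ROUNDS; i++)
        count[rand() % BUCKETS]++;

    /* chi-squared statistic: sum of (observed - expected)^2 / expected */
    double expected = (double)ROUNDS / BUCKETS;
    double chi2 = 0.0;
    for (int k = 0; k < BUCKETS; k++) {
        double d = count[k] - expected;
        chi2 += d * d / expected;
    }
    printf("chi-squared = %.2f (99 degrees of freedom)\n", chi2);
    return 0;
}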
 
It's quite fun to play around with this, so I did the plotting experiment I described earlier. The result looks pretty unbiased no matter how many times I refresh it:
[attached image: a scatter plot of 400,000 random points, uniformly spread with no visible clustering]

This was written in C#, generating 400,000 random numbers in the range [0, imageWidth) (Random.Next's upper bound is exclusive). No seeding here; C#'s Random does that itself by default, and the sequence is different every time, so when I refresh I get a different, but equally uniform, pattern.
Of course we don't know if the C# random generator is different from the one in C. It probably is, as it can also generate float and double numbers, in any range specified. It seems clear there is no bias towards lower numbers here, or we would consistently see a higher concentration of black in the upper left quadrant. The code executed with each refresh is:

C#:
private void pictureBox1_Paint(object sender, PaintEventArgs e)
{
    Graphics g = e.Graphics;
    var rand = new Random();
    var n =  Int32.Parse(nPoints.Text);            

    for ( int i=0; i<n; i++ )
    {
        var x = rand.Next(0, pictureBox1.Width);
        var y = rand.Next(0, pictureBox1.Height);
        g.FillRectangle(Brushes.Black, x, y, 1, 1);
    }
}
 
I'm not certain about this, but I think random numbers in most languages are generated as float or double. Depending on the function they're returned from, they are often converted to integers, since at least historically, integers were what most programmers asked for.
 
It's quite fun to play around with this, so I did the plotting experiment I described earlier. The result looks pretty unbiased no matter how many times I refresh it. [...]
So that reveals that if you used C to print the small numbers, but mapped them to big numbers (very easy with if/else and a little bit of division), then that takes care of the fact that rand() tends to be biased towards the small numbers.
 
Well, I thought to do some research too, with much bigger data. [...]

That's a neat program, thank you for posting it here and helping me figure out how to run it correctly.
 
So that reveals that if you used C to print the small numbers, but mapped them to big numbers (very easy with if/else and a little bit of division), then that takes care of the fact that rand() tends to be biased towards the small numbers.
Does it? IMO, all this experiment reveals is that, at least in C# on Windows on 64-bit Intel architecture, the random function is not biased towards any values. Or if it is, it is not perceptible to the naked eye. But OK, I did not use % 100, but rather asked the function to generate numbers in the specified range (the size of my canvas). Perhaps it matters, although I somehow doubt it. I would try the same experiment in C, but I don't do any graphics programming in C.
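For completeness, the usual modulo-free approach in C scales by division instead of taking a remainder. It doesn't eliminate the bucket-size imbalance, but it spreads the slightly fuller buckets across the whole range instead of piling them all onto the low values. A sketch:

C:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    srand((unsigned)time(NULL));

    for (int i = 0; i < 10; i++) {
        /* scale rand() into [0, 100) by division rather than % */
        int n = (int)((double)rand() / ((double)RAND_MAX + 1.0) * 100.0);
        printf("%d\n", n);
    }

    return 0;
}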
 
