Probability #1

tutorial

I started re-reading The Drunkard’s Walk again today.

I read it a few years ago and remember really liking it. There are lots of examples in there of situations that seem to defy logic, or at least defy our sense of what is logical. But these are provable mathematically using the basic probability.

Whenever I come across a problem like this, even once I get my head around it as much as I can, I like to write some code to prove it out.

One of my favorite such problems is the two children problem. I haven’t gotten to it in the book yet, but I know it’s coming up. It’s a classic. Here it is, my paraphrase:

A woman says she has two children. She says one of them is a boy. What are the odds that the other one is a boy?

The obvious answer is 50%. There’s a child you don’t know about. It’s either a girl or a boy. Everything else is irrelevant, right?

Nope. Actually the odds are 1 in 3 that the other child a boy, 2 in 3 that it’s a girl.

And the really odd part of it is you can change the wording in a way that seems to make no difference, but totally changes it:

A woman says she has two children. She says the first born one is a boy. What are the odds that the other one is a boy?

In this case, yes, the odds are 50% that the other child is a boy.

To understand the odds of a particular situation occurring, such as the genders of two children, you have to consider all possible arrangements and then how many of those arrangements satisfy the criteria. Then divide.

In this case, we have a family who had one child, then another child. They may have had a boy first, then a girl. Or maybe a boy, then another boy. Or a girl and then a boy. Or a girl and another girl. That’s four possibilities:

  • boy boy
  • boy girl
  • girl boy
  • girl girl

So in the first problem, the mother says that one of her children is a boy. This narrows us down to just three possibilities.

  • boy boy
  • boy girl
  • girl boy

In all three of those, one of the children is a boy. In two of them the other child is a girl and in only one, the other child is also a boy. So, 1 in 3 for boy, 2 in 3 for girl.

But let’s look at the second problem. Mom says that her first child is a boy. That gives us only two possibilities:

  • boy boy
  • boy girl

In one of those, the other child is a girl and in one, it’s a boy. 50/50.

Prove it with Code

I’m doing this with JavaScript using node.js. But use whatever you want.

For probability situations, it helps to have a large number of samples. So let’s make 1000 families. Each family will have two children. We can represent these by strings: “bb” means they had a boy, then another boy. “gb” means they had a girl, then a boy, in that order. Likewise, “bg” means the opposite order and “gg” means they had two girls. We’ll store all the families in an array, and we’ll actually go through one by one, randomly choosing the gender of the first child, and then the second child.

let families = [];

// create 1000 families with two randomly gendered children.
for (let i = 0; i < 1000; i++) {
  // no kids yet.
  let family = "";

  // first child
  if (Math.random() < 0.5) {
    family += "b";
  } else {
    family += "g";
  }

  // second child
  if (Math.random() < 0.5) {
    family += "b";
  } else {
    family += "g";
  }

  families.push(family);
}
console.log(families);

This should give you something like the following output, with 1000 total strings.

['gb', 'gb', 'gg', 'gb', 'gb', 'gg', 'bb', 'bg', 'bb', 'gb',
'bg', 'gg', 'gg', 'bg', 'bg', 'gg', 'gg', 'bg', 'gb', 'gg',
'gg', 'gb', 'bb', 'bg', 'gg', 'gb', 'gg', 'gb', 'bb', 'gb',
'bg', 'bb', 'gg', 'gb', 'gb', 'bb', 'bg', 'bg', 'gb', 'bb',
'gb', 'gb', 'gg', 'gg', 'bb', 'bb', 'bb', 'gb', 'gg', 'gb',
'gb', 'bg', 'gg', 'bg', 'bg', 'gb', 'bg', 'gg', 'gg', 'bg',
'bb', 'gb', 'bg', 'gb', 'gg', 'gg', 'bg', 'bg', 'gg', 'bb',
'gb', 'bg', 'gg', 'gb', 'bg', 'bg', 'bg', 'gg', 'bb', 'gb',
'gg', 'bb', 'bb', 'gg', 'gg', 'bb', 'gg', 'gg', 'bg', 'bb',
'bb', 'gg', 'gg', 'gg', 'gg', 'bg', 'gg', 'gg', 'bg', 'gg',
...
]

We have multiples of every type of family in there: “bb”, “bg”, “gg”, “gb”. Now, the mother said that one of the children was a boy. So let’s filter this down to only the families that have a “b” in them.

// now lets get all the families who have at least one boy
let oneBoyFamilies = families.filter(family => family.indexOf("b") > -1);
console.log(oneBoyFamilies);

My var name isn’t the greatest. Consider oneBoyFamilies to mean “at least one boy”. This should give you something like the following output.

['gb', 'gb', 'gb', 'gb', 'bb', 'bg', 'bb', 'gb', 'bg', 'bg',
'bg', 'bg', 'gb', 'gb', 'bb', 'bg', 'gb', 'gb', 'bb', 'gb',
'bg', 'bb', 'gb', 'gb', 'bb', 'bg', 'bg', 'gb', 'bb', 'gb',
'gb', 'bb', 'bb', 'bb', 'gb', 'gb', 'gb', 'bg', 'bg', 'bg',
'gb', 'bg', 'bg', 'bb', 'gb', 'bg', 'gb', 'bg', 'bg', 'bb',
'gb', 'bg', 'gb', 'bg', 'bg', 'bg', 'bb', 'gb', 'bb', 'bb',
'bb', 'bg', 'bb', 'bb', 'bg', 'bg', 'bg', 'bb', 'bb', 'bb',
'gb', 'bb', 'bb', 'bb', 'bg', 'bg', 'bg', 'gb', 'bb', 'bb',
'gb', 'bb', 'bg', 'bg', 'bb',
...
]

I don’t see any “gg”s in there, but just to be sure, we can say:

// validate that there are no families with two girls here
console.log(oneBoyFamilies.indexOf("gg"));

If we’ve done that right, we should get a -1 here. I do, so I’m satisfied.

Now, to find families where the other child is a boy, we need to look for families consisting of “bb”. But for families where the other child is a girl, we need to look for either “gb” or “bg”. Already you can see why it’s 2 to 1 in favor of the other child being a girl. But here’s the code:

let otherChildIsABoy = oneBoyFamilies.filter(family => family === "bb");
let otherChildIsAGirl = oneBoyFamilies.filter(family => family === "bg" || family == "gb");
console.log("boy: " + otherChildIsABoy.length);
console.log("girl: " + otherChildIsAGirl.length);

Alternately, for the girls, you could do something like:

let otherChildIsAGirl = oneBoyFamilies.filter(family => family.indexOf("g") > -1);

Shouldn’t make any difference. With either method, I consistently get in the mid-200s for boys and around 500 for girls. Total would be around 750-ish, which makes sense since we filtered out one of the four arrangements (“gg”). So, 1 in 3 for boys, 2 in 3 for girls. Spot on.

Finally, let’s do the second version. Here, the mom says her first child is a boy. So for that we have to search for families where the first character is a “b”. Then we just get the count of the resulting families where the second character is a “b” and the count where it’s “g”.

// this time, lets get all the families where the FIRST child is a boy
let firstBoyFamilies = families.filter(family => family.charAt(0) === "b");
console.log(firstBoyFamilies);

let secondChildIsABoy = firstBoyFamilies.filter(family => family.charAt(1) === "b");
let secondChildIsAGirl = firstBoyFamilies.filter(family => family.charAt(1) === "g");

console.log("boy: " + secondChildIsABoy.length);
console.log("girl: " + secondChildIsAGirl.length);

The results you see here will vary. Sometimes you’ll get more boys than girls as the second child. Sometimes the opposite. Occasionally they’ll be dead even. We’ll call that 50%.

Why am I doing this?

This is firmly in the realm of recreational mathematics. I just find it really interesting to prove these things out even when my brain doesn’t agree 100% all the time. I enjoy doing it and I’ll probably do some more.

PS:

Here’s all the code in one spot:

let families = [];
// create 1000 families with two randomly gendered children.
for (let i = 0; i < 1000; i++) {
  // no kids yet.
  let family = "";

  // first child
  if (Math.random() < 0.5) {
    family += "b";
  } else {
    family += "g";
  }

  // second child
  if (Math.random() < 0.5) {
    family += "b";
  } else {
    family += "g";
  }

  families.push(family);
}
console.log(families);

// now lets get all the families who have at least one boy
let oneBoyFamilies = families.filter(family => family.indexOf("b") > -1);
console.log(oneBoyFamilies);

// validate that there are no families with two girls here
console.log(oneBoyFamilies.indexOf("gg"));

let otherChildIsABoy = oneBoyFamilies.filter(family => family === "bb");
let otherChildIsAGirl = oneBoyFamilies.filter(family => family === "bg" || family == "gb");
// alternately…
// let otherChildIsAGirl = oneBoyFamilies.filter(family => family.indexOf("g") > -1);

console.log("boy: " + otherChildIsABoy.length);
console.log("girl: " + otherChildIsAGirl.length);

// this time, lets get all the families where the FIRST child is a boy
let firstBoyFamilies = families.filter(family => family.charAt(0) === "b");
console.log(firstBoyFamilies);

let secondChildIsABoy = firstBoyFamilies.filter(family => family.charAt(1) === "b");
let secondChildIsAGirl = firstBoyFamilies.filter(family => family.charAt(1) === "g");

console.log("boy: " + secondChildIsABoy.length);
console.log("girl: " + secondChildIsAGirl.length);

2 thoughts on “Probability #1

  1. Did the book discuss the variant of the problem?

    A man states that he has two children and that at least one of them is a boy born on a Tuesday. What is the probability that the man has two boys?

Leave a Reply