eleven
$\begingroup$

Today in my laundry basket I had 10 distinct pairs of socks. I repeatedly take a sock randomly from my basket. If it matches a sock in my lap, I pair both up and put them aside, otherwise I put it on my lap. On average, how many socks do I need to take before I match up a pair?

Bonus (original question): On average, what is the maximum number of socks in my lap?

(Inspired by a true story.)

$\endgroup$
nine
  • two
    $\begingroup$ I'm confused by the question, "On average, what is the maximum". Do you mean that if you repeated this experiment many times and recorded the maximum each time, then averaged all of those maximum values? $\endgroup$ Commented Jun 13 at 17:51
  • one
    $\begingroup$ @GentlePurpleRain yes, it's a frequentist question. $\endgroup$
    –  qwr
    Commented Jun 13 at 18:02
  • one
    $\begingroup$ Do you have reason to think there's a nice answer to this? $\endgroup$
    –  xnor
    Commented Jun 13 at 19:51
  • one
    $\begingroup$ @z100 In your first scenario (where you pick a match every time), you have a maximum of 1 in your lap, not 0, since each sock has to sit in your lap until its match is drawn. It is impossible to get 0. $\endgroup$ Commented Jun 13 at 20:19
  • two
    $\begingroup$ @xnor nice? no. But the easier problem of determining the average number of socks that need to be pulled before a pair arises I think has a nice recursion. Maybe I should ask that one instead. $\endgroup$
    –  qwr
    Commented Jun 13 at 20:50

5 Answers

seven
$\begingroup$

Let $N$ be the number of distinct pairs of socks, let the random variable $X_k$ be the number of socks drawn to get the first pair when $k$ socks are in your lap, and let $E_k=\mathbb{E}(X_k)$. We want to compute $E_0$.

We derive a recursion by conditioning on the next sock drawn when you have $k$ in your lap. With probability $\frac{k}{2N-k}$, the next sock is a match and you are done. With probability $1-\frac{k}{2N-k}=\frac{2N-2k}{2N-k}$, the next sock is not a match and you have $k+1$ socks in your lap. Hence we obtain the recursion $$E_k = \begin{cases} 0 &\text{if $k > N$}, \\ 1 + \frac{k}{2N-k} \cdot 0 + \frac{2N-2k}{2N-k} E_{k+1} &\text{otherwise}. \end{cases} $$

For $N=10$, the values are approximately:

\begin{matrix}
k & E_k \\ \hline
0 & \color{red}{5.6754638550} \\
1 & 4.6754638550 \\
2 & 3.8796562914 \\
3 & 3.2396133278 \\
4 & 2.7195304695 \\
5 & 2.2927072927 \\
6 & 1.9390609391 \\
7 & 1.6433566434 \\
8 & 1.3939393939 \\
9 & 1.1818181818 \\
10 & 1 \\
\ge 11 & 0
\end{matrix}
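The recursion can be evaluated exactly with rational arithmetic. The following is a minimal sketch, not the answerer's own code; the function name `expected_draws` is chosen here for illustration. It walks $k$ from $N$ down to $0$, starting from the boundary value $E_{N+1}=0$.

```python
from fractions import Fraction


def expected_draws(n_pairs):
    """Return E_0 for n_pairs pairs, using E_k = 1 + (2N-2k)/(2N-k) * E_{k+1}."""
    e_next = Fraction(0)  # boundary case: E_k = 0 for k > N
    for k in range(n_pairs, -1, -1):
        # Probability that the next sock does NOT match any sock in the lap.
        p_no_match = Fraction(2 * n_pairs - 2 * k, 2 * n_pairs - k)
        e_next = 1 + p_no_match * e_next
    return e_next


value = expected_draws(10)
print(value, float(value))
```

For $N=10$ this yields the fraction $\frac{262144}{46189}$, which agrees with the decimal value in the table.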


For the bonus question, let the random variable $Y_{k,b,m}$ be the maximum number of socks in your lap to get all pairs when $k$ socks are in your lap, $b$ are in the basket, and the current maximum is $m$, and let $E_{k,b,m}=\mathbb{E}(Y_{k,b,m})$. We want to compute $E_{0,2N,0}$.

We derive a recursion by conditioning on the next sock drawn when you are in state $(k,b,m)$. With probability $\frac{k}{b}$, the next sock is a match and you move to state $(k-1,b-1,m)$. With probability $1-\frac{k}{b}$, the next sock is not a match and you move to state $(k+1,b-1,\max(m,k+1))$. Hence we obtain the recursion $$E_{k,b,m} = \begin{cases} m &\text{if $b=0$}, \\ \frac{k}{b} E_{k-1,b-1,m} + \left(1-\frac{k}{b}\right) E_{k+1,b-1,\max(m,k+1)} &\text{otherwise}. \end{cases} $$

For various $N$, the values are approximately:

\begin{matrix}
N & E_{0,2N,0} \\ \hline
5 & 3.5735449735 \\
10 & \color{red}{6.4892979634} \\
20 & 12.026472105 \\
100 & 53.914792025 \\
\end{matrix}
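The three-index recursion is easy to memoize. Here is a hedged sketch (the name `expected_max_lap` is an assumption made for this example, not something from the original answer) that evaluates $E_{0,2N,0}$ exactly and skips branches whose probability is zero.

```python
from fractions import Fraction
from functools import lru_cache


def expected_max_lap(n_pairs):
    """Return E_{0,2N,0}: the expected maximum lap size over the whole process."""

    @lru_cache(maxsize=None)
    def e(k, b, m):
        # k socks in the lap, b socks in the basket, m = maximum lap size so far.
        if b == 0:
            return Fraction(m)
        p_match = Fraction(k, b)
        total = Fraction(0)
        if k > 0:  # the next sock matches one in the lap
            total += p_match * e(k - 1, b - 1, m)
        if k < b:  # the next sock starts a new unmatched sock in the lap
            total += (1 - p_match) * e(k + 1, b - 1, max(m, k + 1))
        return total

    return e(0, 2 * n_pairs, 0)


for n in (5, 10):
    print(n, float(expected_max_lap(n)))
```

The recursion depth is only $2N$, so moderate values of $N$ stay well within Python's default recursion limit.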

$\endgroup$
two
  • $\begingroup$ I confirm this answer. :) This is a really nice way of doing it. Perhaps you can explain the "conditioning on the next sock drawn" part in more detail for future readers $\endgroup$ Commented Jun 13 at 22:28
  • $\begingroup$ Yes, this is the intended solution. You can calculate it exactly as a fraction too. $\endgroup$
    –  qwr
    Commented Jun 13 at 22:43
four
$\begingroup$

TL;DR

The answer for 10 pairs is 5.675463855030418 socks

Let's first consider the case with only 2 pairs.

It's impossible to make a pair when you take one sock, so the smallest number you have to take is 2 socks. What's the probability of making a pair at the second sock? It's 1/3, since after the first one there are only 3 left in the basket and exactly one of them matches. If the second sock doesn't match, the third draw guarantees a match, since both remaining socks pair with either sock A or sock B.

In other words, for the scenario with 2 pairs, you have 1/3 chance of drawing twice and 2/3 chance of drawing thrice. On average, that's drawing 2*(1/3) + 3*(2/3) = 2.67 socks before you make a pair.

Now let's extend to three pairs.

After drawing the first one, the chance to make a pair on the second one is 1/5. If that fails, the chance to make a pair on the third one is 2/4. If that still fails, the fourth draw guarantees a pair since you already have three different socks in your lap.

When we aggregate the possibilities, you have a 1/5 chance of drawing only 2 socks, a (4/5)(2/4) chance of drawing exactly three, and a (4/5)(2/4)(1) chance of drawing four (in this last product the 2/4 is the chance of not matching on the third draw, which here happens to equal the chance of matching). On average, that's 2(1/5) + 3(4/5)(2/4) + 4(4/5)(2/4)(1) = 3.2 socks you'll have to draw to get a match.

Let's do one more: 4 pairs.

Again, the first sock cannot make a pair, so we start from the second. The chance to get a pair on the second draw is 1/7, the chance to get a pair on third draw is (6/7)(2/6), the chance to get a pair on the fourth draw is (6/7)(4/6)(3/5), then if we still don't have a pair, the fifth draw guarantees a pair.

On average, that will be 2(1/7) + 3(6/7)(2/6) + 4(6/7)(4/6)(3/5) + 5(6/7)(4/6)(2/5) = 3.66 socks to make a pair.

Now, let's generalize that into p pairs.

From the patterns above, with p pairs of socks the average number of draws needed can be expressed by the following function D(p):

D(p) = 2(1/(2p-1)) + 3(1 - 1/(2p-1))(2/(2p-2)) + 4(1 - 1/(2p-1))(1 - 2/(2p-2))(3/(2p-3)) + ... + p(1 - 1/(2p-1))(1 - 2/(2p-2))...(1 - (p-2)/(2p-(p-2)))((p-1)/(2p-(p-1))) + (p+1)(1 - 1/(2p-1))(1 - 2/(2p-2))...(1 - (p-2)/(2p-(p-2)))(2/(2p-(p-1)))

The final factor 2/(2p-(p-1)) is the chance of still having no pair after p draws; the (p+1)-th draw then matches with certainty.

We can also rewrite this using sigma notation: $$D(p) = \sum_{i=2}^{p+1} i \cdot \frac{i-1}{2p-(i-1)} \prod_{j=1}^{i-2}\left(1-\frac{j}{2p-j}\right) $$ where the empty product (for $i=2$) equals 1. Plugging in p=10, we get roughly 5.675463855030418 socks.
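As a check on the general formula, here is a short sketch that sums the terms of D(p) directly with exact fractions. The loop variable `reach` (an illustrative name) carries the running product of the "no match so far" factors.

```python
from fractions import Fraction


def D(p):
    """Expected number of draws until the first pair, with p pairs in the basket."""
    total = Fraction(0)
    reach = Fraction(1)  # probability that draws 2..i-1 all failed to match
    for i in range(2, p + 2):
        lap = i - 1  # unmatched socks in the lap just before draw i
        p_match = Fraction(lap, 2 * p - lap)
        total += i * reach * p_match
        reach *= 1 - p_match
    return total


for p in (2, 3, 4, 10):
    print(p, D(p), float(D(p)))
```

The small cases come out as 8/3, 16/5 and 128/35, matching the hand computations of 2.67, 3.2 and 3.66 above.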

$\endgroup$
four
$\begingroup$

We'll work a slightly more general case of having $N$ distinct pairs of socks total. The specific problem posed is $N=10$ .

Denote the state $n$ as the state of having $n$ socks in your lap. When you are in state $n$ , there are $2N-n$ socks left in the pile, out of which there are $n$ candidates that would form a pair. So, starting from state $n$ , the probability of finding a pair is $\frac{n}{2N-n}$ , and the probability of moving to state $n+1$ is $\frac{2N-2n}{2N-n}$ .

It follows that the probability of reaching state $n$ is: $$\prod_{k=1}^{n-1}\frac{2N-2k}{2N-k}$$

The probability of ending in state $n$ (that is, having $n$ socks in your lap when you find a pair) is hence: $$P(n)=\frac{n}{2N-n}\prod_{k=1}^{n-1}\frac{2N-2k}{2N-k}$$

Here is a plot of $P(n)$ for $N=10$ as well as $N=100$:

[Figure: probability distribution]

Note that $n$ ranges from $1$ to $N$, inclusive. This is because you must already have a sock in order to be able to form a pair, and if you already have $N$ distinct socks in your lap, the next sock is guaranteed to match.

So, the expected value of the number of socks you have to take (which is $n+1$, since the last sock is in your hand instead of your lap) when getting a pair is: $$1+\langle n\rangle=1+\sum_{n=1}^NnP(n)=\boxed{1+\sum_{n=1}^N\frac{n^2}{2N-n}\prod_{k=1}^{n-1}\frac{2N-2k}{2N-k}}$$

Specializing to the case $N=10$, we have: $$1+\langle n\rangle=1+\sum_{n=1}^{10}\frac{n^2}{20-n}\prod_{k=1}^{n-1}\frac{20-2k}{20-k}=\frac{262144}{46189}\approx5.675$$

For what it's worth, the standard deviation can be calculated as well, so the average number of socks taken when you draw a pair is: $$\boxed{\frac{262144}{46189}\pm\frac{\sqrt{8776150330}}{46189}\approx5.675\pm2.028}$$

Here is a plot of the scaling for values of $N$ through $1000$ (shaded area is $\pm1\sigma$):

[Figure: asymptotic scaling]

This suggests that, for large values of $N$, the expected value $\langle n\rangle$ scales as: $$\langle n\rangle=O(\sqrt{N})$$
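The distribution $P(n)$ and its first two moments can be tabulated directly. This is a sketch under the definitions above; `lap_distribution` and `reach` are names invented for this example.

```python
from fractions import Fraction
from math import sqrt


def lap_distribution(n_pairs):
    """Return {n: P(n)}: the chance of holding n lap socks when the first pair appears."""
    dist = {}
    reach = Fraction(1)  # probability of reaching state n without a pair
    for n in range(1, n_pairs + 1):
        p_pair = Fraction(n, 2 * n_pairs - n)
        dist[n] = reach * p_pair
        reach *= 1 - p_pair  # equals (2N-2n)/(2N-n)
    return dist


dist = lap_distribution(10)
mean_lap = sum(n * p for n, p in dist.items())
variance = sum(n * n * p for n, p in dist.items()) - mean_lap**2

print("socks taken:", 1 + mean_lap, float(1 + mean_lap))
print("standard deviation:", sqrt(variance))
```

Adding the constant 1 for the final sock shifts the mean but leaves the standard deviation unchanged, which is why the same $\pm$ value applies to both interpretations.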

$\endgroup$
six
  • $\begingroup$ This is strange, because both my answer and @RobPratt 's answer give around 5.675 socks for 10 pairs, but yours says 4.675 socks. I wonder where the mistake is (in your answer or in mine). $\endgroup$
    –  dvx2718
    Commented Jun 13 at 22:39
  • one
    $\begingroup$ There is no mistake. Just a different interpretation of the question. I don't count the final sock in my answer. $\endgroup$ Commented Jun 13 at 22:40
  • $\begingroup$ Well the question as written asks for socks that need to be taken. $\endgroup$
    –  qwr
    Commented Jun 13 at 23:32
  • $\begingroup$ @qwr See my edited answer. $\endgroup$ Commented Jun 14 at 0:00
  • one
    $\begingroup$ "When you are in state n, there are 2N−n socks left in the pile" => doesn't this fail to account for the fact that some socks may already have been matched, and are thus neither in your lap nor in the pile? $\endgroup$ Commented Jun 14 at 12:57
three
$\begingroup$

This answer solves the bonus question, of what the average maximum number of socks in your lap is.

The answer is

6.489 for 10 pairs of socks

Python code here.

I solved this with a dynamic programming method. Consider a complete sequence of socks drawn. At certain points in the sequence, you have k socks in your lap for the first time (for that particular k). These points in the sequence are called "key." The "signature" of a sequence is the set of key indices, which we can represent with a binary number where the key indices are set to 1.

If we count how many ways there are to form each possible signature with N pairs of socks, we can use this to determine how many ways there are to form each possible signature with N+1 pairs of socks. We do this by taking each existing signature, and adding a new pair of socks to it, with the first sock going at the beginning and the second sock going anywhere after it in the sequence. The effect this has is to add a new key point for the first sock, and delete the first key point that follows the second sock.

By this means, it is possible to find the distribution of signatures for each N, working our way up from N=1 to N=10, from which the answer can be read off.
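The linked program is not reproduced here, so the following is only a sketch of how the signature update described above might be coded. It assumes bit i of a signature marks sequence index i as key, and all function names are illustrative.

```python
from collections import Counter
from fractions import Fraction


def signature_counts(n_pairs):
    """Count sock sequences (up to relabelling pairs) by signature bitmask."""
    counts = Counter({0b1: 1})  # one pair: the only key point is index 0
    length = 2                  # current sequence length
    for _ in range(n_pairs - 1):
        new_counts = Counter()
        for sig, ways in counts.items():
            # The new first sock goes to index 0; its partner lands at index j.
            for j in range(1, length + 2):
                low_mask = (1 << (j - 1)) - 1
                low = sig & low_mask    # old keys that end up before the partner
                high = sig & ~low_mask  # old keys that end up after the partner
                high &= high - 1        # delete the first key after the partner
                new_sig = 1 | (low << 1) | (high << 2)
                new_counts[new_sig] += ways
        counts = new_counts
        length += 2
    return counts


def expected_max_from_signatures(n_pairs):
    counts = signature_counts(n_pairs)
    total = sum(counts.values())
    # The maximum lap size equals the number of key points in the signature.
    weighted = sum(bin(sig).count("1") * ways for sig, ways in counts.items())
    return Fraction(weighted, total)


print(float(expected_max_from_signatures(10)))
```

Each old signature spawns $2N+1$ new ones (one per partner position), so the total count of sequences grows as $1, 3, 15, 105, \dots$ while the number of distinct signatures stays far smaller.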

This won't work for N too large, because the number of possible signatures is exponential in N. However, it can feasibly do N=20, for which the answer is 12.026 (requiring 524288 different signatures). This is a lot better than naive brute force iteration over all possible sock sequences, which for N=20 would involve 319830986772877770815625 sock sequences, far too large to iterate over!

My program agrees with gannolloy's answer, that the figure for 5 pairs is

3.57

$\endgroup$
two
  • $\begingroup$ Good work. I will go through the logic. btw you should leave the code in your answer as it's short enough. $\endgroup$
    –  qwr
    Commented Jun 15 at 15:33
  • $\begingroup$ Thanks for verifying, nice work $\endgroup$
    –  gannolloy
    Commented Jun 17 at 23:38
two
$\begingroup$

Answered the bonus question since I started working on it before OP edited. Also, by misreading the question I did the work for 10 total socks, instead of 10 pairs of socks. Maybe someone can use this to find the actual answer. The average maximum number of socks in your lap by the time all 5 pairs are matched is:

3.57

To start, you take one sock out of the basket and place it on your lap. From there, the total number of possible paths to 5 matched pairs is:

9! or 362880.

Then we get the number of possible paths that lead to a given maximum:

5 is the 'maximum' maximum, since if there are 5 unpaired socks in your lap, every subsequent sock you pull from the basket will have a match in your lap.

46080 paths lead to a maximum of 5. 147456 paths lead to 4, 138240 paths lead to 3, 30720 paths lead to 2, 384 paths lead to 1.

Then we take the sum of the products of each maximum and its respective number of paths, and divide the sum by the total paths:

46080 * 5 = 230400; 147456 * 4 = 589824; 138240 * 3 = 414720; 30720 * 2 = 61440; 384 * 1 = 384; Sum of the above = 1296768

Weighted Total / Total Paths = 1296768 / 362880 = 3.5735
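The path counts can be checked by brute force for 5 pairs. Below is a rough sketch, assuming the first sock is fixed in the lap and the other 9 labelled socks are permuted, which gives the 9! paths counted above. The name `max_lap_counts` is made up for this example, and the enumeration takes a second or so.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def max_lap_counts(n_pairs):
    """Count orderings of the remaining socks by the maximum lap size reached."""
    # Pair 0 already has one sock in the lap; every other pair has both socks left.
    rest = [0] + [pair for pair in range(1, n_pairs) for _ in range(2)]
    counts = Counter()
    for order in permutations(rest):  # equal pair ids still count as distinct socks
        lap = {0}
        best = 1
        for pair in order:
            if pair in lap:
                lap.remove(pair)  # match found: the pair is set aside
            else:
                lap.add(pair)
                best = max(best, len(lap))
        counts[best] += 1
    return counts


counts = max_lap_counts(5)
total = sum(counts.values())
weighted = sum(m * c for m, c in counts.items())
print(sorted(counts.items()), Fraction(weighted, total), weighted / total)
```

This approach scales as $(2N-1)!$, so it is only practical for small $N$; it is meant as a sanity check rather than a way to reach 10 pairs.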

Let me know if I made a mistake, besides the obvious misreading the question. I don't think the answer for 10 pairs is as simple as doubling this one, but it may be.

Edit: formatting

$\endgroup$
six
  • one
    $\begingroup$ By my method, I am getting an average of $\frac{193}{63}\approx3.063$ for 5 pairs of socks $\endgroup$ Commented Jun 13 at 21:55
  • $\begingroup$ @DanDan noodles is that the number of socks in your lap before finding a pair, or the maximum number of socks in your lap by the time you match all 5 pairs? $\endgroup$
    –  gannolloy
    Commented Jun 13 at 22:11
  • $\begingroup$ The number of socks in your lap, at the moment when you are holding a matching sock in your hand (which is not in your lap). So, if I drew 2 socks and then on the 3rd draw I find a matching sock, I count that as 2 socks in your lap. Of course, to get the other interpretation, just add 1. $\endgroup$ Commented Jun 13 at 22:20
  • $\begingroup$ @DanDan Noodles Got it, thanks. My answer was looking at the bonus question, 'On average, what is the maximum number of socks in my lap?' $\endgroup$
    –  gannolloy
    Commented Jun 13 at 22:23
  • $\begingroup$ My bad! You are right, sorry for misreading your answer. $\endgroup$ Commented Jun 13 at 22:23
