Every four years football gives us the World Cup, the premier showcase of global talent. At the same time, the Italian company founded by the Panini brothers gives us something far greater. Panini produce a sticker album with hundreds of stickers to collect, representing the players, stadiums and nations at the World Cup. In doing so, they give us an intriguing mathematical challenge: how can we collect all 682 stickers as cheaply as possible?

Since the release of this tournament’s sticker album there have been several articles claiming to calculate the cost of completing the collection. Most of these articles make the same basic mistakes, and are hugely misleading. Always remember to check the assumptions and logical basis of any of these analyses. This post is based on the facts and assumptions listed below:

1) There are 682 stickers to collect

2) Each pack contains 5 stickers and costs £0.80

3) Each sticker is equally likely to occur in any pack

4) Individual stickers can be purchased on the Panini website for £0.22 each

Point 4 is ignored by almost everyone who attempts to calculate the cost of completing the collection – but it makes a huge difference.

With this in mind, let’s get started. What is the simplest way that someone can collect all the stickers? Keep buying individual packs until you have them all! But how many packs do you need to buy? We can answer this through simulations. We use random numbers to pick 5 of the 682 available stickers to create one pack of stickers. Then we count how many of those stickers are unique – not duplicates. We then randomly pick another 5 stickers to make another pack, and add those to our simulated collection. We can continue like this until our simulated collector has all 682 stickers. The beauty of doing this on a computer is that we can then replicate this a large number of times, to work out how long it would take to complete the collection on average. I did this 100,000 times and have plotted the results below.

I found that on average you would have to buy 968 packs to complete the collection! Which would cost a total of £774.40, more than most would be willing to spend on a football sticker album.
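The simulation described above can be sketched in a few lines of Python. This is a minimal sketch rather than the code used for the figures in this post: the function names are made up for illustration, and it assumes each sticker in a pack is drawn independently (so a pack can occasionally contain the same sticker twice), in line with assumption 3.

```python
import random

N_STICKERS = 682  # size of the collection
PACK_SIZE = 5     # stickers per pack


def packs_to_complete(rng, n_stickers=N_STICKERS, pack_size=PACK_SIZE):
    """Buy simulated packs until every sticker is owned; return the pack count."""
    owned = set()
    packs = 0
    while len(owned) < n_stickers:
        # Assumption: each sticker in a pack is an independent uniform draw.
        owned.update(rng.randrange(n_stickers) for _ in range(pack_size))
        packs += 1
    return packs


def average_packs(trials, seed=0):
    """Average number of packs needed over `trials` simulated collectors."""
    rng = random.Random(seed)
    return sum(packs_to_complete(rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    # Fewer repeats than the 100,000 used in the post, to keep the run short.
    print(average_packs(1000))
```

With a few hundred repeats the average is noisier than a 100,000-run figure, but it should land in the same region as the number quoted above.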

Why do we need to buy so many packs to complete the collection? Buying 968 packs would give us 4840 stickers, of which only a meagre 14% would be unique.

The best way of seeing why we have to buy so many packs is another simulation. But this time, instead of simulating the continual buying of packs until the total of 682 is reached, we will simulate the buying of a fixed number of packs n, where n is any number between 1 and 1000. We can then record the number of unique stickers we expect to get for each number of packs we buy. This is repeated 10,000 times for each number of packs, to get a good estimate of the average number of unique stickers obtained from each number of packs bought.

As you can see, we get a nice curve. And this makes good logical sense. At the start, for each new pack we buy it’s quite likely that all the stickers we get will be new to us, and contribute to our collection. Towards the end of the curve, when we have already got most of the collection, each new pack is likely to contain only stickers that we already have. Note that we already have most of the stickers when we have bought 500 packs, but on average it took an extra 468 packs to complete the collection in our simulations from earlier!
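A curve like this can be reproduced with a short simulation. The sketch below is illustrative only: `mean_unique` is a hypothetical helper name, it uses fewer repeats than the 10,000 mentioned above, and it again assumes every sticker is an independent uniform draw.

```python
import random

N_STICKERS = 682
PACK_SIZE = 5


def mean_unique(n_packs, trials=1000, seed=0):
    """Average number of different stickers owned after buying `n_packs` packs."""
    rng = random.Random(seed)
    draws = n_packs * PACK_SIZE
    total = 0
    for _ in range(trials):
        # A set keeps one copy of each sticker, so its size is the unique count.
        total += len({rng.randrange(N_STICKERS) for _ in range(draws)})
    return total / trials


if __name__ == "__main__":
    for n_packs in (1, 50, 100, 250, 500, 1000):
        print(n_packs, round(mean_unique(n_packs, trials=200), 1))
```

Printing the values for a handful of pack counts shows the flattening directly: early packs add nearly 5 new stickers each, late packs add almost none.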

At this point do we give up hope, or start saving up more than £700 and preparing to empty the local Sainsburys of its supply of stickers? Well not quite yet. At any point we can go online and order individual stickers, the ones we need to complete the collection. Why don’t we just do this from the beginning, you may ask. Well aside from that being blatant cheating, individual stickers cost £0.22 each online, compared to the £0.16 they cost when you buy them as part of a pack. However, it seems clear that with this information we no longer have to buy 968 packs. As we have already noted, by 500 packs we already have most of the collection.

So with this new information, the key question becomes ‘at what point should I stop buying packs, and buy individual stickers instead?’. We can easily calculate this by taking our figure above, for the number of unique items we get for each number of packs bought, and converting this into a final cost, but adding the cost of buying that number of packs to the cost of buying the number of stickers required to complete the collection.

The y-axis of this graph is cost, so we can look for the point where this is lowest to find the optimal number of packs, and the cost of buying that number of packs. The answer is 44 packs, for a total cost of £143.88. That is a much more reasonable answer than 968 packs for a cost of £774.40.

So far we have been using simulations to answer this question, but can we answer the same question with simple maths? Intuitively we know that we are dealing with simple probabilities, the probability of getting any sticker by picking one at random is 1/682. Can we however use this to calculate how many unique stickers we get for each number of packs bought? Yes we can.

This kind of situation is known in maths as the ‘coupon collector’s problem’. Let’s use n for the number of stickers in the collection (682 here) and p for the probability that the next sticker we buy is new to our collection. For the first sticker p is 1, since we don’t own any stickers yet and so there is no chance of a duplicate. For the second sticker p is 681/682, since only one of the 682 options would be a duplicate. Similarly, once we own two different stickers the chance of adding to our collection is 680/682. This can be formalised: the probabilities of getting a new sticker run 1, (n-1)/n, (n-2)/n, … and so on, so that when we already own k unique stickers the probability is (n-k)/n.

Can we use this simple maths to work out the time it takes to reach any defined number of unique stickers? Yes we can! Time and probability can be simple inverses of each other, if defined carefully. For example, since the chance of rolling an even number on a die is 1/2, we would expect to take 2 rolls to see an even number. So we can invert each probability and add up the results to get the expected time t (counted in stickers bought) to collect any number of unique stickers: t = 1 + n/(n-1) + n/(n-2) + … and so on, with one term for each unique sticker collected.

We must be careful to include the fact that we don’t buy individual stickers, we buy them in packs of 5, so we have to divide this time by 5 to convert it to a number of packs. Using this we can create a mathematical version of the simulations we did earlier, and find it to be indistinguishable.
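The sum described above is easy to evaluate directly. Here is a minimal sketch, with `expected_packs` as an illustrative name; it adds one term n/(n-k) for each unique sticker collected and divides by the pack size.

```python
N_STICKERS = 682
PACK_SIZE = 5


def expected_packs(unique_target, n_stickers=N_STICKERS, pack_size=PACK_SIZE):
    """Expected number of packs needed to own `unique_target` different stickers."""
    # With k unique stickers owned, the next new one takes n / (n - k) purchases
    # on average; summing these gives the expected number of stickers bought.
    stickers_bought = sum(n_stickers / (n_stickers - k) for k in range(unique_target))
    return stickers_bought / pack_size


if __name__ == "__main__":
    for target in (100, 341, 600, 682):
        print(target, round(expected_packs(target), 1))
```

For the full collection of 682 this comes out within a pack or so of the 968 packs found by simulation.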

Next, we can see if this gives a different answer to the ultimate question of how many packs we should buy.

Here, we can see that this solution is exactly the same as our simulations! We should still be buying 44 packs and then buying the rest of the stickers online. Our simulations, and our simple maths, are in agreement.
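One way to reproduce this result from the formula is sketched below. It is an illustration under the post's stated prices, not the exact calculation behind the graph: for every possible number of unique stickers, it adds the expected cost of the packs needed to reach that number to the cost of buying the remainder individually, and keeps the cheapest. The name `cheapest_plan` is made up for this example.

```python
N_STICKERS = 682
PACK_SIZE = 5
PACK_PRICE = 0.80    # pounds per pack
SINGLE_PRICE = 0.22  # pounds per individually ordered sticker


def cheapest_plan():
    """Return (expected packs, total cost) for the cheapest packs-plus-singles mix."""
    # Baseline: buy no packs and order every sticker individually.
    best_packs, best_cost = 0.0, N_STICKERS * SINGLE_PRICE
    stickers_bought = 0.0
    for owned in range(N_STICKERS):
        # Expected purchases to go from `owned` to `owned + 1` unique stickers.
        stickers_bought += N_STICKERS / (N_STICKERS - owned)
        packs = stickers_bought / PACK_SIZE
        missing = N_STICKERS - (owned + 1)
        cost = packs * PACK_PRICE + missing * SINGLE_PRICE
        if cost < best_cost:
            best_packs, best_cost = packs, cost
    return best_packs, best_cost


if __name__ == "__main__":
    packs, cost = cheapest_plan()
    print(round(packs, 1), round(cost, 2))
```

The pack count comes out fractional because it is an expected value; rounding up to whole packs gives an answer in line with the 44 packs quoted above, and the cost differs from the simulated figure only by pennies.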

So is that the final word, shall we put some of our savings back in the bank and prepare to spend £144? Well actually that’s not the end of it either.

We are a social species, and we love to trade. So I could bring in a partner to collect stickers with me. And we would both benefit because I could give her my swaps for free, and receive her swaps for free!

Before you continue, have a quick think and make a prediction about what effect this will have on the optimal number of packs we each need to buy to complete the collection. We will assume that both partners buy the same number of packs, and give all of their duplicates to the other player. Compared to the optimal number of packs for a player on their own, do you think we should buy more packs when playing with a partner, or fewer? Why do you think this?

For me, the easiest way to think about the effect of another player is that it gives us loads of extra packs for free. For any number of packs we buy, we inevitably get a certain number of duplicates. Our partner has the same. Those duplicates are then packaged up and given to the other player as free packs. This has the effect of shifting the curve for the number of unique stickers we get for any number of packs bought.

In red here we can see how the black curve we had before shifts when we take into account all the free cards we get from our partner. So what is the effect on the optimal number of packs we should buy?

In red here we can see the new cost function: how much it costs to complete the collection for any number of packs bought (before doing swaps and buying the rest individually). Contrary to my original expectation, we have to buy more packs when playing with a partner than we do when alone: 74 packs compared to 44 when playing on our own. The key, however, is that the total cost to complete the collection is lower, at £136.20. Buying more packs each gives us more swaps to give to the other player, allowing us to obtain a larger collection before we have to buy the comparatively expensive individual stickers online for £0.22 each.
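The ‘free packs’ idea can be turned into a short calculation. The sketch below is one possible reading of it, not necessarily the exact calculation behind the graph: it assumes the swaps received from a partner behave like extra random stickers, and it uses the standard result that m independent draws from n stickers give n(1 - (1 - 1/n)^m) different stickers on average. All function names are illustrative.

```python
N_STICKERS = 682
PACK_SIZE = 5
PACK_PRICE = 0.80    # pounds per pack
SINGLE_PRICE = 0.22  # pounds per individually ordered sticker


def expected_unique(draws, n_stickers=N_STICKERS):
    """Expected number of different stickers after `draws` independent draws."""
    return n_stickers * (1 - (1 - 1 / n_stickers) ** draws)


def cost_with_partner(packs):
    """Expected cost per player when both buy `packs` packs and exchange swaps."""
    bought = packs * PACK_SIZE
    # Our partner is expected to hold as many duplicates as we do.
    swaps = bought - expected_unique(bought)
    # Assumption: received swaps act like that many extra random stickers.
    unique = expected_unique(bought + swaps)
    return packs * PACK_PRICE + (N_STICKERS - unique) * SINGLE_PRICE


def best_packs(max_packs=300):
    """Number of packs per player that minimises the expected cost."""
    return min(range(max_packs + 1), key=cost_with_partner)


if __name__ == "__main__":
    n = best_packs()
    print(n, round(cost_with_partner(n), 2))
```

Under these assumptions the cost curve is very flat near its minimum, so the exact optimum matters less than being in the right region, which is well above the single-player figure.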

This calculation was based on our maths, but I’m a big fan of checking results like this with simulations, where we can be confident that we are recreating the effect we want to observe, and can catch any big mistakes in the maths. The graph below shows the average cost to complete the collection based on the number of packs bought when there are two players cooperating.

This simulation has an element of randomness: in this example 133 packs was on average the optimal number of packs to buy, but repeating it leads to slightly different answers. The important thing is the shape of the curve, which is lowest at around 74 packs, much higher than the 44 packs for a single player and in full agreement with our mathematical calculation.

So there we have it, you should grab a friend and each buy 74 packs, then give all your duplicates to your partner, take their duplicates, and buy the rest of the stickers online. Incidentally, you should expect to have to buy around 349 stickers online if you want to complete the collection for as little money as possible (these can be purchased on the Panini website in batches of up to 50). You may wish to compare your progress as you go against the mathematical expectation. By pure luck you could end up doing better than expected, and benefit from stopping buying packs much sooner to save more money.

The remaining question is what happens if you have not one partner but two or three, or even ten. I have used the same principle of swapping and simulated this scenario, with the result shown below.

Here we find that with 3 players we actually don’t end up buying any more or fewer packs than we do with 2 players. But why is this? Shouldn’t adding more members to our syndicate of sticker collectors make a larger difference than this? Again we have to check our assumptions, and how realistic they are. Up until now we have instructed our simulated players to use a simple swapping system, whereby each player simply passes all of their duplicates to the next player. We could visualize this by imagining a 10-player system where all players sit in a circle, and pass all their swaps to the player on their left. You can easily see the problem with this system: many of the swaps that you get won’t be helpful to you, because you will already have that sticker. Players can be much savvier with their swapping, trading individual swaps to any of the other players for specific individual stickers that they still need. In this final section I have named this system ‘strategic swapping’ and enacted it by having all players pool all of their swaps together, and take turns strategically picking one sticker at a time that they need to further their collection.
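A pooled, turn-taking swap of this kind could be simulated along the following lines. This is a rough sketch under my own simplifications, not the code behind the results in this post: every player buys the same number of packs, stickers are independent uniform draws, and on each turn a player simply takes the first pooled sticker they are missing. The function names are hypothetical.

```python
import random
from collections import Counter

N_STICKERS = 682
PACK_SIZE = 5


def strategic_swap(n_players, n_packs, rng):
    """Return each player's unique sticker count after pooling and picking swaps."""
    collections = []
    pool = Counter()  # spare copies available to everyone
    for _ in range(n_players):
        counts = Counter(rng.randrange(N_STICKERS) for _ in range(n_packs * PACK_SIZE))
        collections.append(set(counts))
        for sticker, copies in counts.items():
            if copies > 1:
                pool[sticker] += copies - 1
    picked = True
    while picked:  # keep going round the table until nobody can use the pool
        picked = False
        for owned in collections:
            wanted = next((s for s in pool if pool[s] > 0 and s not in owned), None)
            if wanted is not None:
                owned.add(wanted)
                pool[wanted] -= 1
                picked = True
    return [len(owned) for owned in collections]


def mean_unique_with_swaps(n_players, n_packs, trials=100, seed=0):
    """Average unique stickers per player over `trials` simulated groups."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += sum(strategic_swap(n_players, n_packs, rng)) / n_players
    return total / trials


if __name__ == "__main__":
    for players in (2, 3, 4):
        print(players, round(mean_unique_with_swaps(players, 74), 1))
```

Because a player's own duplicates are never useful to them, the gain from the pool comes entirely from the other players' spares, which is why extra players help.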

I have done this for 2, 3, and 4 players and shown the main result below.

In black you can see the number of unique stickers achieved for each number of packs bought when collecting alone. In red, blue and green we see 2, 3 and 4 players respectively. As we add more players to our syndicate of strategic swapping collectors, we increase the number of unique stickers we get for any number of sticker packs we each buy. However, for each extra person we add, the benefit of adding that player is less than the benefit of adding the previous player.

How does this impact the amount of money needed to complete the collection? For two players we needed to purchase 74 packs and then swap and buy the rest online. For three players we need to purchase 82 packs, and for four players it’s 91 packs. Switching from two to four players brings the total cost down from about £136 to £130. A modest, but worthwhile saving.

As a final note, I have attached here (see the link below) a spreadsheet that all players can use to track their progress. For each pack purchased simply add in the total number of unique stickers you now own (without any swapping) and you can see how you compare to the mathematical expectation. You can see that my progress for the first few packs is right in line with what we expect. May the odds be ever in your favour.

Great job! Are you aware of similar research that has been referenced on Wikipedia? The German version is more detailed (see Sammelbilderproblem). The theory is broadly in line with your findings.

Did you contact any of the journalists reporting the misleading results from Prof Harper?

This is really excellent! Better than any other analysis I’ve seen. Nice job

Fantastic analysis. Are you making good progress on your own collection?

This is a really insightful analysis. Very helpful. Thanks

A really intuitive and interesting example of the Coupon Collector’s Problem. Love it!