Purely out of my own curiosity, I was running through some math in my head this morning. If someone has a 40% success rate at some activity, then attempts two more trials, the first of which is a failure and the second of which is a success, what is flawed in the following logic: When the failure occurred, there were fewer total trials than there were when the success occurred, meaning the failure constituted a larger percentage of the total trials than did the success, and so the total average should decrease more than it should come back up? I realize this is inherently false; the total average will always increase. If you are earning a 40% in a class and then score a 50% on a two-question quiz, your grade will always increase, no matter which order the questions were asked in. The most apt explanation I've been able to come up with is not to look at the 1/2 as a 50%, but instead to look at 0/1 and 1/1 separately, as 0% followed by 100%. Because the 0% is closer to the average (40%) than the 100% is, the 100% will have a greater effect on the average. In a set of data, an outlier will always affect the average more than a trial closer to the average, no matter which order the two trials were conducted in. But what makes that logic more valid than my earlier 'logic'?

Yeah... you're overthinking it. Your "success rate" is based off some prior data. In the example where you have 40% in a class, and then get a 50% on a test... if the 40% is out of a million points overall, and your test was a total of 2 points... it's probably not going to show on your final grade. Also, you're using two different scales here. You have a percentage based on some number of previous trials, then you're saying a single instance of 0% and a single instance of 100% are going to shift the results by different amounts. You are comparing a discrete count against a percentage, which doesn't really make sense. If you're going to do that, you should stay on the same scale: 2/5 -> 2/6 -> 3/7. This is the "success rate" after each trial. Also, it sounds like you're assuming the base number doesn't change (or something??). It's true that there are fewer instances at the time of the first failure, but the second event hasn't occurred yet, so it isn't being counted in the total. Once you move to the next step, there is a larger data set, and all points are weighted the same (since you're taking a running total).
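To make the running-total point concrete, here is a minimal Python sketch (the function name `running_rates` is made up for illustration). It starts from 2 successes in 5 trials and applies one failure and one success in both orders. Every trial carries the same weight in the final tally, so both orders end at 3/7.

```python
from fractions import Fraction

def running_rates(successes, trials, outcomes):
    """Return the success rate after each new outcome (1 = success, 0 = failure)."""
    rates = []
    for outcome in outcomes:
        successes += outcome
        trials += 1
        rates.append(Fraction(successes, trials))
    return rates

# Start at 2/5 = 40%, as in the post above.
fail_then_success = running_rates(2, 5, [0, 1])
success_then_fail = running_rates(2, 5, [1, 0])

print(fail_then_success)  # intermediate rate 1/3, final rate 3/7
print(success_then_fail)  # intermediate rate 1/2, final rate 3/7
```

Only the intermediate value depends on the order; the final rate does not.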

Let's see... Suppose the 40% came from 4 successes in 10 trials. The 11th trial is a failure, so we now have 4/11 = .363636..., for a decrease of .03636... Then the 12th trial is a success, 5/12 = .41666..., for an increase of .0530303... It turns out that the change from the 12th trial was larger than that from the 11th! This seems strange, b/c as the denominator grows, we'd expect the change to shrink. I think the answer to this paradox is that a positive outcome does not always move the rate by the same amount as a negative one. For example, suppose I'm 1/100: a further failure to 1/101 is only a change of .00009901, but a successful trial yields 2/101, which is a change of .0098. BTW, positive changes aren't always larger than negative ones. The closer you are to 100%, the larger the changes that come from failures, but the closer you are to 0%, it's the other way around. Since 40% is closer to 0% than to 100%, the absolute change from the success is larger, even though it has a greater denominator.
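The arithmetic above is easy to check with exact fractions. A small Python sketch, where `rate_change` is just an illustrative helper name:

```python
from fractions import Fraction

def rate_change(successes, trials, outcome):
    """Change in success rate caused by one more trial (1 = success, 0 = failure)."""
    before = Fraction(successes, trials)
    after = Fraction(successes + outcome, trials + 1)
    return after - before

# 4/10 -> 4/11 -> 5/12, the sequence from the post above.
drop = rate_change(4, 10, 0)   # -2/55, about -0.0364
rise = rate_change(4, 11, 1)   # 7/132, about +0.0530
print(float(drop), float(rise))

# Starting from 1/100: a failure barely moves the rate, a success moves it a lot.
print(float(rate_change(1, 100, 0)))  # about -0.000099
print(float(rate_change(1, 100, 1)))  # about +0.0098
```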

Probability is always used for the future. In the present, it is only relative frequency. If you want to revise the probabilities based on new data, then you need to apply Bayes' theorem, with a priori and a posteriori probabilities. That is how signal estimation is done in the presence of noise as more and more samples arrive.

But the OP didn't mention probability, he mentioned a paradox. A fail (weight 1/n) followed by a success (weight 1/(n+1)) seems to give greater weight to the second term. Why? Say a coach has won 9 of 10 games for a 90% win rate. If he wins the next game, his rate becomes 91%, it only goes up 1%. But if he loses, his rate will be 73%, it drops nearly 18%! This doesn't seem fair. With a high success rate, new wins only improve it a tiny bit, but just one loss makes a huge drop. This lack of symmetry between wins and losses explains the puzzle the OP discussed, IMO.

My friend, you are simply overthinking fractions. sureshs has it correct in that you are mixing probability with actuality. The probability of you getting something right may be 40%. That's your initial statement. You're now wondering what's going on because GIVEN two new results, your relative success rate has changed. That is completely independent of your 40% probability. If you want to use the two new trials, then, as sureshs said, you need to recalculate your probability. Otherwise, the two trials are meaningless with regard to your probability of success. Let's use a fairly classic example of a coin toss (a Bernoulli trial where heads is the success, with probability 50%). That 50% never changes no matter what. No matter how many times you toss that coin, it will always be just as likely to be heads as it is tails. Using this example, your paradox can be explained via the following two probabilities: 1. What is the probability of getting a heads on the 9th toss given you've gotten 8 tails previously? 1/2. All of the trials are independent, and so the probability of the heads has nothing to do with the probability of the prior 8 tails. 2. What is the probability of getting 8 tails and then getting a heads on the 9th toss? That is (1/2)^8 * (1/2) = (1/2)^9 ≈ 0.002, or about 0.2%. In this case, your tails have negatively affected what you're thinking of as your success rate when in reality, they have no effect whatsoever. The first one is what's really going on, and the second is what you're confusing with it, when it's completely different.
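For anyone who wants to check the two coin-toss probabilities, here is a short Python sketch using exact fractions (the variable names are illustrative). Note that the exact sequence of 8 tails then a head has probability (1/2)^9; a factor of 9 only appears if the single head is allowed to land on any of the 9 tosses.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 2)  # fair coin, heads counted as the success

# 1. Heads on the 9th toss, given 8 tails already happened: independence
#    means the earlier tosses do not matter.
p_heads_given_8_tails = p

# 2. The exact sequence T,T,T,T,T,T,T,T,H: multiply the nine probabilities.
p_exact_sequence = (1 - p) ** 8 * p

# For comparison: exactly one head anywhere in 9 tosses (9 possible positions).
p_one_head_anywhere = comb(9, 1) * p * (1 - p) ** 8

print(p_heads_given_8_tails)                            # 1/2
print(p_exact_sequence, float(p_exact_sequence))        # 1/512 0.001953125
print(p_one_head_anywhere, float(p_one_head_anywhere))  # 9/512 0.017578125
```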

Your example is true, but the OP is talking about probability. He says that you have a 40% success rate at something. That means the probability of you being successful is 40%. That is entirely different than you having been successful in 40% of your attempts as your example is stating. What you've said is what the OP is saying when he talks about a 40% average in a class being affected by getting a 50% on an assignment. That 40% is based on some prior successes and failures only. That is not the same thing as you being 40% likely to get a question right.

As you say, he says, "You have a 40% success rate." Does this mean that the intrinsic probability of success is 40% (you and sureshs)? Or does it mean only that after a certain # of trials, he's logged a 40% success rate (me)? In the context of everything else he says I think my interpretation makes sense, but let's ask the OP. OP! Is this about fractions or probability? P.S. To the OP: What you say in the final paragraph is correct, i.e. a success has more weight b/c it is further from 40% than a fail. What's wrong with the earlier argument? It has its merits, but the size of the denominator simply isn't the only factor.
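The "further from 40%" point can be written out in one line. A sketch, using my own symbols rather than anything the OP defined: s successes in n trials so far, p = s/n the current rate, and x the next outcome (1 for a success, 0 for a fail).

```latex
\frac{s+x}{n+1} - \frac{s}{n}
  = \frac{n(s+x) - s(n+1)}{n(n+1)}
  = \frac{nx - s}{n(n+1)}
  = \frac{x - p}{n+1}
```

So each new trial moves the rate by its distance from the current rate, divided by the new number of trials. The growing denominator (the OP's first argument) and the distance from the average (the OP's second argument) both appear, and at 40% the distance wins: starting from 4/10, the fail changes the rate by (0 - 4/10)/11 = -2/55, and the following success changes it by (1 - 4/11)/12 = 7/132.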

I think the entire problem is that it's vague and not defined well. There's definitely a mixup of terminology and/or interpretation.

Well put. I presumed that given the way he worded it, you've got a 40% prob. of doing some activity correctly. But at the same time, it could also mean that you've been successful 20 out of the last 50 times you tried something.

Relative frequencies of the past are used as probabilities of the future if mathematical models are not available. If the relative frequencies turn out not to be useful (they make too many incorrect predictions), it's time to ditch them.

8/11 = 73%. But his rate would be 9/11 = 82%. You're right. Yet the point remains: an increase of 1% from a win, a decrease of 8% from a loss. It's asymmetric with respect to wins and losses. In the OP's example, even though the 'win' comes later, and is a smaller proportion of the total, at 40% a 'win' causes a greater change than does a 'loss'.

I think in signal processing such things are handled by giving weights to new arriving data (smoothing them out with a filter) so that one new point cannot rock the boat, so to speak. So, here p would be "modulated" by a filtered version like p(n+1) = 0.95*p(n-1) + 0.05*p(n), with p(n-1) = 9/10, p(n) = 9/11, so that p(n+1) = 0.896
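A minimal Python sketch of that weighting, assuming the simple two-term blend written above (the function name `smooth` and the `alpha` parameter are illustrative, not a reference to any particular filter library):

```python
def smooth(previous, new, alpha=0.05):
    """Blend a new estimate into the previous one, giving the new one weight alpha."""
    return (1 - alpha) * previous + alpha * new

before_loss = 9 / 10   # coach's rate after 10 games
after_loss = 9 / 11    # raw rate after losing game 11

smoothed = smooth(before_loss, after_loss)
print(round(smoothed, 3))  # 0.896
```

With a weight of 0.05 on the new value, the single loss pulls the reported rate down from 0.900 to about 0.896 instead of to 0.818.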