# very tricky Interview question

Chrix Wu

Ranch Hand

Posts: 121

posted 5 years ago

I was given an interview question yesterday:

in a company, employee are allowed to go for lunch between

and each people have only

John and Jack are work in the same company,

what is the

Please give your answer and reason!

in a company, employee are allowed to go for lunch between

**1:00pm and 2:00pm**,and each people have only

**20 minutes**for staying in the cafeteria,John and Jack are work in the same company,

what is the

**probability**that both people are in the cafeteria (even 1 second counts) ?Please give your answer and reason!

** SCJP 5.0 84% **

** SCWCD 1.5 76% **

Varun Khanna

Ranch Hand

Posts: 1400

posted 5 years ago

Its more complicated than that. This is the approach I could come up with in 5 minutes. I have no clue if it is right, so scrutinize as you please

The total number of possibilities for having lunch is 40.

0-20

1-21

2-22

...

40-60

If A and B have lunch, they can have lunch at any of these time slots. There are 3 distinct slots

There is a 19/40 chance of lunch intersecting (Assuming that 0-20 and 20-40 is an invalid intersection)

The probability of A and B meeting each other increases as we move from one slot to the next. If A arrives at 1-21, B will meet A for the slots 0-20 through 20-40 (20 chances). If A arrives at 2-22, B will meet A from 0-20 through 21-41 (21 chances). So the chances of not meeting A reduces with each try - 19,18,17... until A arrives at 20-40. It is hard to miss A here. If B arrives at 0-20 or 40-60, then B will miss A. Missing A is a linearly decreasing function till this point.

The opposite happens this time. When A comes at 22-42, B can miss A on slots 0-20 | 1-21 | 2-22. B cannot come at 43 because he/she cannot have lunch. From this point, the chances of a miss is a linear increasing function 2,3,4,5...19

So the summation N(N+1)/2 - 2 for N=19 gives you the total misses for all combinations. There are 40 possibilities for each attempt.

Not sure how to proceed from there. Do I multiply the individual probabilities ? Or is this approach going to fall like a house of cards

Varun Khanna wrote:Perhaps (20/60) * (20/60) = 0.11?

Disclaimer: been decade and a half since I worked on probability

Its more complicated than that. This is the approach I could come up with in 5 minutes. I have no clue if it is right, so scrutinize as you please

The total number of possibilities for having lunch is 40.

0-20

1-21

2-22

...

40-60

If A and B have lunch, they can have lunch at any of these time slots. There are 3 distinct slots

**Slot 0-20 and 40-60**

There is a 19/40 chance of lunch intersecting (Assuming that 0-20 and 20-40 is an invalid intersection)

**Slot 1-21 through 20-40**

The probability of A and B meeting each other increases as we move from one slot to the next. If A arrives at 1-21, B will meet A for the slots 0-20 through 20-40 (20 chances). If A arrives at 2-22, B will meet A from 0-20 through 21-41 (21 chances). So the chances of not meeting A reduces with each try - 19,18,17... until A arrives at 20-40. It is hard to miss A here. If B arrives at 0-20 or 40-60, then B will miss A. Missing A is a linearly decreasing function till this point.

**Slot 21-41 through 40-60**

The opposite happens this time. When A comes at 22-42, B can miss A on slots 0-20 | 1-21 | 2-22. B cannot come at 43 because he/she cannot have lunch. From this point, the chances of a miss is a linear increasing function 2,3,4,5...19

So the summation N(N+1)/2 - 2 for N=19 gives you the total misses for all combinations. There are 40 possibilities for each attempt.

Not sure how to proceed from there. Do I multiply the individual probabilities ? Or is this approach going to fall like a house of cards

Matthew Brown

Bartender

Posts: 4566

8

posted 5 years ago

Here's my approach.

First person takes their lunch at 1pm +

Second person takes their lunch at 1pm +

And I'm going to assume that x and y have an even probability distribution over that range (unlikely in practice, but we don't have anything else to go on).

The lunchtimes overlap if:

- for

- for

Now you need to average that over the range of

= [(20)^2/2 + 20(20)]/(20*40)

=

(Sorry about the notation - don't suppose JavaRanch supports TeX?)

If you can't assume the distribution is constant, then you need a probability distribution

First person takes their lunch at 1pm +

*x*, where*x*in [0, 40]Second person takes their lunch at 1pm +

*y*, where*y*in [0, 40]And I'm going to assume that x and y have an even probability distribution over that range (unlikely in practice, but we don't have anything else to go on).

The lunchtimes overlap if:

- for

*x*<= 20,*y*in [0,*x*+ 20], so probability of overlap is (*x*+ 20)/40- for

*x*>= 20,*y*in [*x*- 20, 40], so probability of overlap is (60 -*x*)/40Now you need to average that over the range of

*x*. By symmetry, the two conditions above provide the same contribution anyway, so we just need to calculate one of them.*P*= 1/20 integral{0, 20} (*x*+ 20)/40*dx*= [(20)^2/2 + 20(20)]/(20*40)

=

**3/4**(Sorry about the notation - don't suppose JavaRanch supports TeX?)

If you can't assume the distribution is constant, then you need a probability distribution

*f(x)*, and it will need a double integral.
Ulf Dittmer

Rancher

Posts: 42967

73

posted 5 years ago

I don't think that's the right approach. The question specifically said to take seconds into account as well.

It's also possible that a person will take less than 20 minutes (we don't know if that was allowed by the original question, though).

The total number of possibilities for having lunch is 40.

I don't think that's the right approach. The question specifically said to take seconds into account as well.

It's also possible that a person will take less than 20 minutes (we don't know if that was allowed by the original question, though).

posted 5 years ago

Ahh I see. I just noticed the "even 1 second counts" quote. In that case every slot, say 0-20, can be broken down to 60*20 = 1200 seconds. The slots would now be categorized by seconds. 0-1200, 1-1201 and so on. But for the units, the approach would then remain the same and should be valid.

I agree but disagree. I agree that a person could take less than 20 minutes, but the minimum lunch time is debatable. No on can have lunch in 1 second for example Assuming that A and B stay for exactly 20 minutes would simplify the problem.

Ulf Dittmer wrote:The total number of possibilities for having lunch is 40.

I don't think that's the right approach. The question specifically said to take seconds into account as well.

It's also possible that a person will take less than 20 minutes (we don't know if that was allowed by the original question, though).

Ahh I see. I just noticed the "even 1 second counts" quote. In that case every slot, say 0-20, can be broken down to 60*20 = 1200 seconds. The slots would now be categorized by seconds. 0-1200, 1-1201 and so on. But for the units, the approach would then remain the same and should be valid.

It's also possible that a person will take less than 20 minutes

I agree but disagree. I agree that a person could take less than 20 minutes, but the minimum lunch time is debatable. No on can have lunch in 1 second for example Assuming that A and B stay for exactly 20 minutes would simplify the problem.

Campbell Ritchie

Sheriff

Posts: 48652

56

Matthew Brown

Bartender

Posts: 4566

8

posted 5 years ago

Thanks for pointing that. Yep I am pretty sure there are other boundary mistakes in my post. I didnt double check any of the numbers. Just wanted to jot down an initial approach quickly.

Campbell Ritchie wrote:That makes 41, not 40. It's like indices in an array.Deepak Bala wrote: . . . The total number of possibilities for having lunch is 40.

0-20

1-21

2-22

...

40-60 . . .

Thanks for pointing that. Yep I am pretty sure there are other boundary mistakes in my post. I didnt double check any of the numbers. Just wanted to jot down an initial approach quickly.

Mike Simmons

Ranch Hand

Posts: 3040

10

posted 5 years ago

Matthew was right in the first place. I don't know why the rest of you are still talking about it. Seriously, counting discrete possibilities like that is just using an integer approximation to a continuous function - why bother?

If Jack shows up at 12:00, he has a 50% chance of meeting John. If he shows up later, his chance of meeting John increases linearly, up to a maximum of 100% if Jack shows up at 12:20 exactly. (Technically there is a minuscule chance that John shows up at exactly 12:00 or 12:40 and misses Jack, but this is too small a chance to be counted.) If Jack shows up after 12:20, his chances go back down again linearly, to 50% if he shows up at 12:40.

So, if Jack arrives between 12:00 and 12:20, he goes from a 50% chance of meetup to 100% chance, and the average is 75%. If Jack arrives between 12:20 and 12:40, he goes from 100% to 50%, and the average for this period is also 75%. Overall, the chance of meetup is 75%.

That's what Matthew's integrals are telling us. (Prior to the f(x) stuff.)

----

Although, since this is an interview question, I think it's worth exploring some other possibilities that avoid the simplifying assumptions that most of us have made.

Aside from some people eating in less than 20 minutes, there could be clustering effects. Maybe most people try to show up early each day, to get the best choice of food. Or maybe John or Jack despise lines, and show up late in order to avoid them. Maybe John and Jack are friends, and make a point of having lunch together often. Maybe they despise each other and choose completely different times for that reason.

Now all that sounds like it makes it impossible to determine the answer. And it does, without more data. But in an interview, we can ask questions, and imagine solutions based on gathering more data.

In particular, it seems to me that any company with such rigid rules about employee lunch times may well be tracking the times somehow. Perhaps with an electronic badging system that logs lunch start and stop times for each employee? There may well be a database or log file somewhere with real data in it. In which case, for a particular pair of employees Jack and John, we could write a query or program to determine how often they actually

We don't know which of these approaches the interviewer may be interested in. If I were the interviewer, I think they would

If Jack shows up at 12:00, he has a 50% chance of meeting John. If he shows up later, his chance of meeting John increases linearly, up to a maximum of 100% if Jack shows up at 12:20 exactly. (Technically there is a minuscule chance that John shows up at exactly 12:00 or 12:40 and misses Jack, but this is too small a chance to be counted.) If Jack shows up after 12:20, his chances go back down again linearly, to 50% if he shows up at 12:40.

So, if Jack arrives between 12:00 and 12:20, he goes from a 50% chance of meetup to 100% chance, and the average is 75%. If Jack arrives between 12:20 and 12:40, he goes from 100% to 50%, and the average for this period is also 75%. Overall, the chance of meetup is 75%.

That's what Matthew's integrals are telling us. (Prior to the f(x) stuff.)

----

Although, since this is an interview question, I think it's worth exploring some other possibilities that avoid the simplifying assumptions that most of us have made.

Aside from some people eating in less than 20 minutes, there could be clustering effects. Maybe most people try to show up early each day, to get the best choice of food. Or maybe John or Jack despise lines, and show up late in order to avoid them. Maybe John and Jack are friends, and make a point of having lunch together often. Maybe they despise each other and choose completely different times for that reason.

Now all that sounds like it makes it impossible to determine the answer. And it does, without more data. But in an interview, we can ask questions, and imagine solutions based on gathering more data.

In particular, it seems to me that any company with such rigid rules about employee lunch times may well be tracking the times somehow. Perhaps with an electronic badging system that logs lunch start and stop times for each employee? There may well be a database or log file somewhere with real data in it. In which case, for a particular pair of employees Jack and John, we could write a query or program to determine how often they actually

*do*have lunch together. This would factor in most all the unknowns that we've been avoiding. Although it would be worthwhile to look at historical trends as well - perhaps they*used*to be friends, but now avoid each other. Whatever. The point is, an answer based on gathering real data is likely to be far more accurate than an answer based on assumptions we make just because they simplify the problem.We don't know which of these approaches the interviewer may be interested in. If I were the interviewer, I think they would

*both*be relevant.
Matthew Brown

Bartender

Posts: 4566

8

Jimmy Clark

Ranch Hand

Posts: 2187

posted 5 years ago

What are they serving for lunch? If it is not tasty, then employees might not stay for 20 minutes, maybe they stay for 2 minutes.

Also, to get the actual probability that Jack and John are in the cafeteria at the same time, you would need to know how many employees will be going to the cafeteria for lunch, i.e. sample space.

Also, to get the actual probability that Jack and John are in the cafeteria at the same time, you would need to know how many employees will be going to the cafeteria for lunch, i.e. sample space.

posted 5 years ago

good point. How many people does the fire marshal allow in the cafeteria at any given time? If the second person arrives and the cafeteria is full, they have to wait until someone leaves - possibly more than one if there is a queue of people already waiting to get in.

There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors

Jimmy Clark

Ranch Hand

Posts: 2187

posted 5 years ago

We need the sample space to do the probability equations properly. This is true even without all of the assumptions mentioned.

If you were playing Roulette and wanted to know the probability of the ball landing in the #7 slot on a single spin, you would need to know how many slots there are on the wheel.

If John and Jack are the only employees that use the cafeteria, that is fine, the sample space is 2.

If 100 employees are using the cafeteria and we want know the probability of John and Jack in the cafeteria at the same time, that is fine, the sample space is 100. The probability value will be different than the one with a sample space of 2.

The question above as written does not contain enough information to actually get an accurate probability value, i.e. perform the

If you were playing Roulette and wanted to know the probability of the ball landing in the #7 slot on a single spin, you would need to know how many slots there are on the wheel.

If John and Jack are the only employees that use the cafeteria, that is fine, the sample space is 2.

If 100 employees are using the cafeteria and we want know the probability of John and Jack in the cafeteria at the same time, that is fine, the sample space is 100. The probability value will be different than the one with a sample space of 2.

The question above as written does not contain enough information to actually get an accurate probability value, i.e. perform the

**correct**probability calculations.
Mike Simmons

Ranch Hand

Posts: 3040

10

posted 5 years ago

I disagree - the other employees are irrelevant, unless they influence the behavior of Jack and John. (E.g. by overcrowding the cafeteria as Fred suggests.). But,

If the question had been whether Jack would encounter

*if*we can assume that both Jack and John choose their start times randomly in an even distribution from 12:00 to 12:40, and if we can assume that both Jack and John stay for exactly 20 minutes - then the other employees*do not*matter. The question was about whether*Jack*and*John*would be present in the cafeteria at the same time on a given day. Other employees are irrelevant.If the question had been whether Jack would encounter

*any*other employees, or whether Jack would encounter John*first*(before another employee did) - then those other employees would be relevant. As it is though, they aren't.
posted 5 years ago

(Do not take my comments seriously)

If they are allowed to

Doesn't ask about whether they are there simultaneously, or even whether they are there at lunchtime. They could be cooks and be there from 12pm through to 3pm. They could be the pencil-pushers who are counting every second that an employee spends in the cafeteria. They could be painters who are there for the entire day except for a 40 minute window.

Chrix Wu wrote:in a company, employee are allowed to go for lunch between1:00pm and 2:00pm,

If they are allowed to

**go**for lunch between 1pm and 2pm, then they could conceivably leave their desk at 2pm to go to the cafeteria, thereby increasing the time range.Chrix Wu wrote:what is theprobabilitythat both people are in the cafeteria (even 1 second counts) ?

Doesn't ask about whether they are there simultaneously, or even whether they are there at lunchtime. They could be cooks and be there from 12pm through to 3pm. They could be the pencil-pushers who are counting every second that an employee spends in the cafeteria. They could be painters who are there for the entire day except for a 40 minute window.

The Sun Certified Java Developer Exam with J2SE 5: paper version from Amazon, PDF from Apress, Online reference: Books 24x7 Personal blog

Chrix Wu

Ranch Hand

Posts: 121

posted 5 years ago

I don't think that is correct, imagine if their max lunch time is raised from 20 min to 31 min?

will the probability be 31/60 * 31/60 ? I dont think so, they will definitely meet each other.

Varun Khanna wrote:Perhaps (20/60) * (20/60) = 0.11?

Disclaimer: been decade and a half since I worked on probability

I don't think that is correct, imagine if their max lunch time is raised from 20 min to 31 min?

will the probability be 31/60 * 31/60 ? I dont think so, they will definitely meet each other.

** SCJP 5.0 84% **

** SCWCD 1.5 76% **

posted 5 years ago

- 1

The way I see it:

Assuming that the 20 minutes lunch time must fall in the interval 1 pm(inclusive) to 2 pm(exclusive)

Then each employee's lunch must start in the interval 1 pm(inclusive) to 1:40 pm(exclusive)

That is, in an interval of 40 * 60 seconds.

They are in the cafeteria at the same time if the difference of the start times of their lunch is less than 20 * 60 seconds

This resolves to finding the probability of the difference of two random numbers in the interval [0..40*60] being less than 20*60

I'm still waiting for the run to complete, this machine is an old P4-2.4 ... ok, it comes up with 74.98%. I would guess that a rigorous solution using probability theory might result in 75%.

Most appropriately, this was done during my lunch break

Assuming that the 20 minutes lunch time must fall in the interval 1 pm(inclusive) to 2 pm(exclusive)

Then each employee's lunch must start in the interval 1 pm(inclusive) to 1:40 pm(exclusive)

That is, in an interval of 40 * 60 seconds.

They are in the cafeteria at the same time if the difference of the start times of their lunch is less than 20 * 60 seconds

This resolves to finding the probability of the difference of two random numbers in the interval [0..40*60] being less than 20*60

I'm still waiting for the run to complete, this machine is an old P4-2.4 ... ok, it comes up with 74.98%. I would guess that a rigorous solution using probability theory might result in 75%.

Most appropriately, this was done during my lunch break

luck, db
*There are no new questions, but there may be new answers.*

Jimmy Clark

Ranch Hand

Posts: 2187

posted 5 years ago

- 1

Just ran it again at home up to Integer.MAX_VALUE and still got 74.98. I changed the code to use BigDecimal to rule out floating point errors, and ran it though 10 iterations.

The final code:The output:

The final code:The output:

luck, db
*There are no new questions, but there may be new answers.*

Jimmy Clark

Ranch Hand

Posts: 2187

Jimmy Clark

Ranch Hand

Posts: 2187

posted 5 years ago

lol

Using the possible time slots as the key measure to determine the "probability" that Jack and John will be in cafeteria at the same time would strongly depend upon a narrow and strict constraint that employees are in the cafeteria for all of their allotted 20 minutes. However, in

The equation to determine a

x = John and Jack in cafeteria at same time

y = number of ways employees can be present in cafeteria at same time

P(x) = n(x) / n(y)

Using the possible time slots as the key measure to determine the "probability" that Jack and John will be in cafeteria at the same time would strongly depend upon a narrow and strict constraint that employees are in the cafeteria for all of their allotted 20 minutes. However, in

**real-life**, some employees may come in for 5 minutes, some will come in for 10 minutes and some may come in for 15 or 19 minutes, some may come in for 2 minutes.The equation to determine a

**real-world "statistical" probability**would be different, and would require quantitative data reflecting employee usage of the cafeteria.x = John and Jack in cafeteria at same time

y = number of ways employees can be present in cafeteria at same time

P(x) = n(x) / n(y)

Mike Simmons

Ranch Hand

Posts: 3040

10

posted 5 years ago

I believe many people have already acknowledged the real-life considerations that make this much more complex. What I'm not clear on though: when you talk about the importance of knowing the total number of employees, do you believe it has importance

(a) Each employee picks a start time randomly in the range 12:00-12:40 with a uniform probability distribution, and

(b) Each employee stays for exactly 20 minutes of lunch.

Sure, there are various reasons these will not be true in the real world, we all agree on that. But my question is, if they

*even given*the simplifying assumptions stated above by several people, namely:(a) Each employee picks a start time randomly in the range 12:00-12:40 with a uniform probability distribution, and

(b) Each employee stays for exactly 20 minutes of lunch.

Sure, there are various reasons these will not be true in the real world, we all agree on that. But my question is, if they

*were*true (or were acceptably close to being true), would you still insist on the importance of knowing the number of employees?
Campbell Ritchie

Sheriff

Posts: 48652

56

posted 5 years ago

It is not if there is a limit to the number of employees allowed in the cafeteria at any given time.
There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors

Campbell Ritchie wrote:I would have thought the probability of the two meeting is independent of the number of employees.

It is not if there is a limit to the number of employees allowed in the cafeteria at any given time.

Campbell Ritchie

Sheriff

Posts: 48652

56

posted 5 years ago
There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors

an I think that is the ultimate point of questions like this. They are less about "getting the right answer" than how you approach the problem, what assumptions you make, whether you realize you are making any assumptions, etc.

IF i asked a question like this, I would want the interviewee to talk through the problem, ask these very questions, sketch out an approach, etc. Anyone who just says "The answer is X" gets lower marks than someone who never comes up with an answer, but picks away at the requirements and unknowns.

IF i asked a question like this, I would want the interviewee to talk through the problem, ask these very questions, sketch out an approach, etc. Anyone who just says "The answer is X" gets lower marks than someone who never comes up with an answer, but picks away at the requirements and unknowns.

Jimmy Clark

Ranch Hand

Posts: 2187

posted 5 years ago

To get the "statistical" probability of two specific employees being in the cafeteria in real-life, you would need to gather data. You would collect employee names, frequencies, durations, etc. With this data you could then determine the "probability" of two specific employees being in the cafeteria at the same time, using known "statistical" probability equations.

x = John and Jack in cafeteria at same time

y = number of ways employees can be present in cafeteria at same time

P(x) = n(x) / n(y)

The equation above would be the final one. There would be other required equations to get the values for x and y.

So, it is not so much about the number of employees, it is the number of combinations possible that matters, e.g. the y value, and here you will know the # of employees.

Aside, in terms of the math puzzles above with the time solts, this is not "statistical" probability and the number of employees or any other data would not matter. All you need is the time slots and the unrealistic constraints mentioned. Not very helpful if I need to make multi-million dollar business decisions about my employees cafeterias.

x = John and Jack in cafeteria at same time

y = number of ways employees can be present in cafeteria at same time

P(x) = n(x) / n(y)

The equation above would be the final one. There would be other required equations to get the values for x and y.

So, it is not so much about the number of employees, it is the number of combinations possible that matters, e.g. the y value, and here you will know the # of employees.

Aside, in terms of the math puzzles above with the time solts, this is not "statistical" probability and the number of employees or any other data would not matter. All you need is the time slots and the unrealistic constraints mentioned. Not very helpful if I need to make multi-million dollar business decisions about my employees cafeterias.

Mike Simmons

Ranch Hand

Posts: 3040

10

posted 5 years ago

Anyway, even for the "real world" version of the problem, I had already talked about the importance of gathering actual data. This is not a new idea. And frankly, the total number of employees STILL DOES NOT MATTER. (At least not directly - not as something we have to measure or think about.*) Those other employees aren't the "sample space" - they're irrelevant. The sample space is the total number of days for which you have records of when people went to lunch. Other than that, all you need to know is, how many times did Jack and John actually meet? The probability is then estimated as

P = (number of days when actually J&J met for lunch, according to records) / (number of days in the records)

Well, we might want to limit this to some range of recent history. Certainly exclude any records before J&J were hired, or before current cafeteria rules were in effect. And it would be interesting to know if P changes significantly if we only look at the last month, or if we include their whole history.

* The important thing here is that by measuring the end result - did they meet or not? - we implicitly include

P = (number of days when actually J&J met for lunch, according to records) / (number of days in the records)

Well, we might want to limit this to some range of recent history. Certainly exclude any records before J&J were hired, or before current cafeteria rules were in effect. And it would be interesting to know if P changes significantly if we only look at the last month, or if we include their whole history.

* The important thing here is that by measuring the end result - did they meet or not? - we implicitly include

*all*the more complicated effects that might have contributed to this outcome. Even if there*is*cafeteria overcrowding, for example, its effect has already been included in the final outcome, which is what we're measuring.
palanisamy subramani

Greenhorn

Posts: 29

posted 5 years ago

John can come to cafeteria 2400 possible times to complete his Lunch in an hour. Each second is one possibilty.

The same applicable for jack also.

So total possible no of times, jack and john in Cafeteria is 4800.

Out of that, 1200 times they cannot have lunch together.

So probability to have lunch together is 75%

The same applicable for jack also.

So total possible no of times, jack and john in Cafeteria is 4800.

Out of that, 1200 times they cannot have lunch together.

So probability to have lunch together is 75%