1. Standard member royalchicken
    CHAOS GHOST!!!
    Elsewhere
    Joined
    29 Nov '02
    Moves
    17317
    25 Jun '03 16:56
    Originally posted by Acolyte


    Also, could you give me the objective function that the hiring officer is trying to maximise? It can't be as simple as 'probability of best candidate being selected', because then he would just interview everyone so he could be sure of finding the best one.
    I still don't understand the merits of this strategy.

    This is because I haven't explained it properly...😳.

    The officer INTERVIEWS the first M, and rejects them out of hand. He then interviews the rest, and hires the first one whose score on the objective criteria is higher than that of everyone interviewed before him. If no such candidate exists, he hires the last candidate by default (that is what I forgot to mention). Furthermore, remember the crucial detail: A PROSPECTIVE PRANCER MUST BE ACCEPTED OR REJECTED IMMEDIATELY FOLLOWING HIS INTERVIEW (before others are interviewed). Given this strategy, I am asking you to tell me what value of M (with M nonzero) maximizes the probability that the best candidate is chosen. Also, given that this is done properly, what is the probability that the best candidate is hired, and where should you put yourself in line? I think that I may have left this crucial detail out of my initial explanation, which gave rise to the problems you astutely mentioned.

    Your second objection is answered by this detail. Sorry about the lack of clarity.
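
    The cutoff rule described above is easy to try out numerically. Here is a small Python sketch of my own (not code from the thread); the candidate count, cutoff, trial count and seed are all arbitrary knobs:

    ```python
    import random

    def hire(n, m, rng):
        """One round of the strategy: reject the first m out of hand, then
        hire the first later candidate who beats everyone seen so far,
        or the last candidate by default. Returns True if the best of all
        n candidates (rank n-1) was hired."""
        scores = list(range(n))          # distinct ranks, higher = better
        rng.shuffle(scores)
        threshold = max(scores[:m]) if m > 0 else -1
        for s in scores[m:-1]:
            if s > threshold:
                return s == n - 1        # first candidate to beat the threshold
        return scores[-1] == n - 1       # nobody qualified: last one by default

    def p_best(n, m, trials=20000, seed=1):
        """Estimate the probability that the best candidate is hired."""
        rng = random.Random(seed)
        return sum(hire(n, m, rng) for _ in range(trials)) / trials
    ```

    For N = 30 this puts the success probability near 0.37 for M in the low teens, in line with the discussion that follows.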
  2. Joined
    26 Apr '03
    Moves
    26771
    26 Jun '03 08:30
    The best candidate can choose where to stand. He knows that the interviewer wants to minimise his workload. Therefore the cleverest candidate stands at the back. The interviewer discards the first M-2 people, interviews the second to last, and hires the last one. Since only a very good candidate will work out that it is best to stand at the back, this hiring technique will work admirably.
  3. Donation Acolyte
    Now With Added BA
    Loughborough
    Joined
    04 Jul '02
    Moves
    3790
    26 Jun '03 12:52
    Originally posted by royalchicken
    The officer INTERVIEWS the first M, and rejects them out of hand...
    Oh right, now it makes sense. So the quality of the last candidate is irrelevant, and if the best candidate of the other N-1 candidates is one of the first M, the hiring officer is bound to choose the last candidate. Is that right?
  4. Standard member royalchicken
    CHAOS GHOST!!!
    Elsewhere
    Joined
    29 Nov '02
    Moves
    17317
    26 Jun '03 16:04
    Originally posted by Acolyte
    Oh right, now it makes sense. So the quality of the last candidate is irrelevant, and if the best candidate of the other N-1 candidates is one of the first M, the hiring officer is bound to choose the last candidate. Is that right?
    That is correct. Sorry I forgot to tell you that detail before. Some linguistic rigidity on my part or some such.
  5. Joined
    26 Apr '03
    Moves
    26771
    26 Jun '03 22:58
    Ok - my thoughts so far: (mucho apologies about previous attempt)
    Forget about the instantly rejected people, for now

    Given his inefficient search procedure (without the rejection step), say the probability that the interviewer will select the best candidate from n is f[n]

    The probability that, with the rejection step, the interviewer will select the best candidate from the total group of G is (n/G)*f[n], where n = G - M is the number of candidates remaining after the rejected batch
    (i.e. the probability that the best candidate is not in the rejected batch, and is then selected)

    f[n] is a summation for all p from 1..n of the values of function g[p,n], which gives the probability that, if the best candidate is at position p within the group of n, the search procedure will find him.

    How does g[p,n] work? Well:
    g[1,n] = 0 (for n ≠ 1) and 1 (for n = 1)
    g[2,n] = 1 for all n > 1
    g[x,n], where x > 2, is more complicated. It is a summation of functions h[b,c,d], where h[b,c,d] is the probability that a random selection of c candidates from a set of d will all rank worse than b

    b ranges from 1..n-1, and represents the ranking of the first candidate within the group (with better candidates higher)
    c = p - 2, and represents the number of candidates between the first and p
    d = n - 2, and is the total number of candidates, excluding the best candidate and the first candidate

    I'm getting there, but I am going to bed now. Perhaps someone else can carry this on?


  6. Standard member royalchicken
    CHAOS GHOST!!!
    Elsewhere
    Joined
    29 Nov '02
    Moves
    17317
    27 Jun '03 01:07
    Okay thus far... (Russ, we need a 'cryptic' smiley.)
  7. Joined
    26 Apr '03
    Moves
    26771
    27 Jun '03 08:37

    I'm getting there, but I am going to bed now. Perhaps someone else can carry this on?
    A little bit more before I go to work... I didn't need parameter c
    h[b,d] represents the probability that a random selection of candidates from d will all rank worse than candidate b,

    obviously there are a total of d-b candidates in d that rank worse than b. The total number of ways of arranging these d-b candidates is (d-b)!
    (where ! means factorial)
    the total number of arrangements of d-b candidates selected from the total set of d-1 is (d-1)!/(b-1)! (I take 1 from d because I am excluding the b'th candidate)

    so h[b,d] = (d-b)!.(b-1)!/(d-1)!

    Wow - I'll have to look up binomial expansions, I only did a chemistry degree.
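
    iamatiger's closed form can be sanity-checked by brute force for small cases. A Python sketch of my own (not from the thread): ranks run 1..d with 1 the best, so "worse than b" means a rank greater than b, and the selection size d-b follows his counting argument:

    ```python
    from itertools import permutations
    from math import factorial

    def h_formula(b, d):
        # h[b,d] = (d-b)! . (b-1)! / (d-1)!  -- the closed form above
        return factorial(d - b) * factorial(b - 1) / factorial(d - 1)

    def h_bruteforce(b, d):
        """Probability that an ordered selection of d-b candidates, drawn
        from the d-1 candidates other than b, consists only of candidates
        ranking worse than b."""
        others = [r for r in range(1, d + 1) if r != b]
        picks = list(permutations(others, d - b))
        good = sum(all(r > b for r in p) for p in picks)
        return good / len(picks)
    ```

    For example, h[2,5] comes out as 1/4 both ways.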
  8. Standard member royalchicken
    CHAOS GHOST!!!
    Elsewhere
    Joined
    29 Nov '02
    Moves
    17317
    28 Jun '03 02:32
    A little more work than is really necessary actually...
  9. Donation Chris
    Site Admin
    Wimbledon
    Joined
    21 Feb '01
    Moves
    26275
    01 Jul '03 16:15 (1 edit)
    Coincidentally, after listening to Jon Tickle's theory of women on BB4, I wrote a little program to test his hypothesis (that given 30 women, you look at the first 10 you meet, and then select the next woman you meet who is better than any you have previously met!!)

    This is the same problem in a different wrapper, so I have some ready-made results...

    Using N=30, I tried this for each value of M from 0 to 29 - and for each value, did 100 runs and calculated the percentage of times the best 'candidate' was selected.

    This shows, for example, that by interviewing just the first 3 candidates and then selecting the next candidate who is better than any of the first three (or the last candidate if the best candidate was in the first three), the average quality of the selected candidate will be approximately 82.15 (if all candidates have a random quality between 0 and 99). Jon's theory that M=10 was optimal is shown to be questionable by this sample, which produced a 79.18 quality rating...

    0 = 49.32
    1 = 69.44
    2 = 80.92
    3 = 82.15
    4 = 82.03
    5 = 81.26
    6 = 82.58
    7 = 80.95
    8 = 83.75
    9 = 78.60
    10 = 79.18
    11 = 73.39
    12 = 80.25
    13 = 76.59
    14 = 74.81
    15 = 73.37
    16 = 75.39
    17 = 66.94
    18 = 65.13
    19 = 61.22
    20 = 64.87
    21 = 61.36
    22 = 54.48
    23 = 53.23
    24 = 57.24
    25 = 54.51
    26 = 60.61
    27 = 51.65
    28 = 53.25
    29 = 50.67
  10. Standard member royalchicken
    CHAOS GHOST!!!
    Elsewhere
    Joined
    29 Nov '02
    Moves
    17317
    01 Jul '03 22:30 (2 edits)
    Quite impressive. For N=30, 11 is about the optimum M. I like your version of the problem better too, never having been one to hang around guys in purple tights...

    I think you've got it. Can you tell me exactly how it works, though?
  11. Standard member royalchicken
    CHAOS GHOST!!!
    Elsewhere
    Joined
    29 Nov '02
    Moves
    17317
    15 Jul '03 15:22 (1 edit)
    Chris, maybe offer an accolade for the complete explanation of this one...let's see if the Answer Prancer will deliver 😉.

    Also, can someone explain the fairly subtle reason why the theoretical answer (M=11 for N=30) disagrees with Chris's data? The program did not quite preserve everything.
  12. Donation Chris
    Site Admin
    Wimbledon
    Joined
    21 Feb '01
    Moves
    26275
    17 Jul '03 12:56
    Originally posted by royalchicken
    Chris, maybe offer an accolade for the complete explanation of this one...let's see if the Answer Prancer will deliver 😉.

    Also, can someone explain the fairly subtle reason why the theoretical answer (M=11 for N=30) disagrees with Chris's data? The program did not quite preserve everything.
    To help, here is the code that was used...

    ITEMS = 30
    SAMPLES = 1000

    FOR M = 0 TO ITEMS - 1

        TOTAL = 0

        * For each value of M, take a reasonable number of samples

        FOR LOOP1 = 1 TO SAMPLES
            ITEM = 0
            BEST = 0

            * Look at the first 'M' items and find the best

            FOR LOOP2 = 1 TO M
                ITEM = RND(100)
                IF ITEM > BEST THEN
                    BEST = ITEM
                END
            NEXT LOOP2

            * Look at the remaining items and stop if any is better than BEST,
            * otherwise take the last item

            FOR LOOP3 = 1 TO (ITEMS - M)
                ITEM = RND(100)
                IF ITEM > BEST THEN

                    * Exit loop

                    LOOP3 = ITEMS + 1
                END
            NEXT LOOP3

            * Add the score of this item to the current total for 'M'

            TOTAL += ITEM
        NEXT LOOP1

        PRINT "Score for " : M : " = " : (TOTAL / SAMPLES)

    NEXT M
  13. Standard member royalchicken
    CHAOS GHOST!!!
    Elsewhere
    Joined
    29 Nov '02
    Moves
    17317
    17 Jul '03 18:06 (4 edits)
    Right then...I guess I should explain. Here's how it works, as compared to what the program looks for. Let P(Best) be the probability that the best prancer candidate is selected; we seek to maximise this. Let p(i) denote the ith prancer candidate (i=1,2,....,N). Clearly:

    P(Best) = SUM (P[p(i) is best]*P[p(i) is selected])

    Where the sum is taken over i=M+1,M+2,...,N.

    Since for each i, P[p(i) is best] = 1/N, and the probability that p(i) is selected, given that it is best, is M/(i-1) (this is the chance that the best of its i-1 predecessors falls among the first M, so that no earlier candidate triggers a hire), we have:

    P(Best) = (M/N)*SUM (1/(i-1)) for i = M+1,...,N

    or (and I'll only get into the details of this step if requested):

    P(Best) ~ (M/N)*(log N - log M)

    Where log denotes a natural logarithm. We need the value of M that maximises this, so we differentiate the above wrt M and set that equal to zero:

    log N - 1 = log M

    or exp(log N - 1) = M

    or M = N/e, where e = 2.7182818...

    Thus the best value of M is N/e, and the maximum probability of getting the best prancer is 1/e, or about 37%. So, for example, if there are 100 potential prancers, 37 should be rejected out of hand and the probability of getting the absolute best prancer will be 37%.
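
    Before the logarithmic approximation comes in, the sum can also be evaluated exactly. A short Python check of my own against the formula above:

    ```python
    def p_best_exact(n, m):
        """P(Best) = (M/N) * SUM(1/(i-1)) for i = M+1,...,N, as derived above
        (valid for 1 <= m <= n-1)."""
        return (m / n) * sum(1 / (i - 1) for i in range(m + 1, n + 1))

    # For N = 30: the exact optimum is M = 11 (note 30/e is about 11.04),
    # with P(Best) about 0.3787.
    best_m = max(range(1, 30), key=lambda m: p_best_exact(30, m))
    ```

    For N = 100 the same exact computation gives M = 37, matching the N/e rule.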

    Therefore, Jon Tickle was not off by much. But can anyone tell me why this theoretical answer does not square with Chris's program.....? (There is a wee problemette, as iamatiger might put it.)



    EDIT Also, I'd be interested in hearing whether anyone can come up with a more efficient algorithm for this sort of thing. This was the best I could do, but I suspect that proving optimality here would be quite difficult.

    EDIT THE SECOND Chris, I like your profile quote.
  14. Donation Chris
    Site Admin
    Wimbledon
    Joined
    21 Feb '01
    Moves
    26275
    18 Jul '03 09:57 (3 edits)
    Originally posted by royalchicken
    Right then...I guess I should explain. Here's how it works, as compared to what the program looks for. Let P(Best) be the probability that the best prancer candidate is selected; we seek to maximise this. Let p(i) denote the ith pr ...[text shortened]... te difficult.

    EDIT THE SECOND Chris, I like your profile quote.
    Here is the output from the program when it is adjusted to count the number of times the BEST item was selected. The number outside the brackets is the average quality (0-99) of the selected candidate. The number in the brackets is the number of times the best candidate of all 30 was selected. Each value of M had 100,000 runs. This shows that the value M=11 produces the highest number of "best" selections, but for average candidate quality, M=4 seems to be best. I'm confused.

    0 = 50.09 (3385)
    1 = 73.34 (13061)
    2 = 79.99 (19805)
    3 = 82.54 (24612)
    4 = 83.27 (28262)
    5 = 83.43 (31196)
    6 = 82.91 (33701)
    7 = 82.08 (35444)
    8 = 81.00 (36360)
    9 = 79.94 (37118)
    10 = 78.78 (37702)
    11 = 77.67 (38357)
    12 = 76.05 (37680)
    13 = 74.56 (37015)
    14 = 73.32 (36543)
    15 = 71.79 (35559)
    16 = 70.21 (34047)
    17 = 68.98 (32934)
    18 = 67.27 (31145)
    19 = 65.85 (29642)
    20 = 64.13 (27452)
    21 = 62.63 (25384)
    22 = 61.00 (22936)
    23 = 59.62 (20759)
    24 = 57.98 (18088)
    25 = 56.50 (15435)
    26 = 54.80 (12605)
    27 = 53.29 (9659)
    28 = 51.67 (6574)
    29 = 50.01 (3307)

    EQU QUALITY TO 15000
    EQU SAMPLES TO 100000
    EQU ITEMS TO 30

    FOR M = 0 TO ITEMS - 1

        TOTAL = 0
        GOTBESTCOUNT = 0

        * For each value of M, take a reasonable number of samples

        FOR LOOP1 = 1 TO SAMPLES
            ITEM = 0
            BEST = -1

            * Look at the first 'M' items and find the best

            FOR LOOP2 = 1 TO M
                ITEM = RND(QUALITY)
                IF ITEM > BEST THEN
                    BEST = ITEM
                END
            NEXT LOOP2

            * Look at the remaining items and stop if any is better than BEST,
            * otherwise take the last item

            PICKEDPOS = -1

            FOR LOOP3 = M + 1 TO ITEMS
                ITEM = RND(QUALITY)
                IF ITEM > BEST THEN

                    * Exit loop

                    PICKEDPOS = LOOP3
                    LOOP3 = ITEMS + 1
                END
            NEXT LOOP3

            * If an item was found that was better than BEST, see if it is the
            * best of ALL the items by examining the remaining items

            IF PICKEDPOS > -1 THEN

                GOTBEST = 1
                FOR LOOP4 = PICKEDPOS + 1 TO ITEMS
                    IF RND(QUALITY) > ITEM THEN
                        GOTBEST = 0
                        LOOP4 = ITEMS + 1
                    END
                NEXT LOOP4

                GOTBESTCOUNT += GOTBEST

            END

            * Add the score of this item to the current total for 'M'

            TOTAL += ITEM
        NEXT LOOP1

        PRINT M : " = " : (TOTAL / SAMPLES / (QUALITY / 100)) : " (" : GOTBESTCOUNT : ")"

    NEXT M
  15. Standard member royalchicken
    CHAOS GHOST!!!
    Elsewhere
    Joined
    29 Nov '02
    Moves
    17317
    18 Jul '03 15:50
    Originally posted by Chrismo
    Here is the output from the program when it is adjusted to count the number of times the BEST item was selected. The number outside the brackets is the average quality (0-99) of the selected candidate. The number in the brackets is the number of times the best candidate of all 30 was selected. Each value of M had 100,000 runs. This shows that the value M ...[text shortened]... "best" selections, but for average candidate quality, M=4 seems to be best. I'm confused.

    You are correct 😀. The problem asks for the way to produce the highest probability of selecting the best candidate, which, as you say, happens at M=11. In practice, shooting for highest average quality seems better from a business standpoint, though.

    Can anyone work out, in view of what I said before, how to get the best average candidate quality in general (noting that Chris was correct in saying M=4 in that case)?
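
    One numerical way into that question is to simulate both objectives side by side. This is a rough Monte Carlo sketch of my own (uniform qualities on [0,1); seeds and trial counts arbitrary), not code from the thread:

    ```python
    import random

    def run_once(n, m, rng):
        """Apply the cutoff-M rule; return (quality hired, hired the best?)."""
        scores = [rng.random() for _ in range(n)]
        threshold = max(scores[:m], default=-1.0)
        hired = next((s for s in scores[m:-1] if s > threshold), scores[-1])
        return hired, hired == max(scores)

    def averages(n, m, trials, seed):
        """Average hired quality and best-candidate frequency over many runs."""
        rng = random.Random(seed)
        qual = best = 0.0
        for _ in range(trials):
            q, b = run_once(n, m, rng)
            qual += q
            best += b
        return qual / trials, best / trials
    ```

    Sweeping M for N = 30 reproduces Chris's pattern: average hired quality peaks at a small M (around 4 or 5), while the frequency of landing the overall best peaks near M = 11.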