
A) cosmic rays

B) radon

C) hairdryers and other household appliances

D) microwave ovens

E) elements inside your body (potassium)

F) rocks and soil

G) gamma rays from processed foods

2. (4pts) MTF Bruce Ames has argued that the rodent model of carcinogenesis (cancer testing of chemicals using rodents) may have serious flaws. That challenge is based on a possibly faulty

A) extrapolation across species (rodents to humans)

B) extrapolation across doses

C) extrapolation across related hazards, or

D) Ames’s challenge is not related to an extrapolation at all

3. (4pts) Which examples/options fit either an accelerating or threshold model of extrapolation? In this question, you are simply deciding if the extrapolation is greater than linear in some fashion. MTF

A) Considering your accident rate when driving after no drinks as a baseline, you are 4X as likely to have an accident while driving after two drinks as after one drink

B) If AT&T can save $2 billion by laying off 20% of its workers, it can save $10 billion by laying off 100%

C) Half the LD50 dose of radiation kills less than 25% of animals (the LD50 kills half).

D) A company needs to sell 1,000 products to break even (no profit) but the profit per product increases with each additional product sold.

4. (4pts) MTF The advice given to Austin residents (circa 1985) to avoid eating more than one fish a week from Town Lake was based on a concern that the fish had high levels of chlordane (a pesticide used in termite treatment). The advice that “eating one fish per week is OK but eating 2 or more per week is not OK” involves

A) extrapolation across species (fish to humans)

B) extrapolation across doses

C) extrapolation across related hazards

D) a linear extrapolation

Errors in Data

5. (4pts) Winning rates at the slot machines at two different casinos are compared for a single day. There is some difference in the average rate between the casinos. Data are then collected for a month of winnings, and the difference between the two casinos is greatly reduced. What type of error underlies most of the differences in daily winning rates between the casinos? One answer only. Sampling error is indicated by the fact that the difference is ‘greatly reduced’ as more data were collected.

A) Sampling B) Bias C) RPA D) Human and technical E) None
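
The reasoning in question 5 (sampling error shrinks as more data are collected) can be illustrated with the standard error of a proportion. This is only a sketch: the win probability and the number of plays per day are hypothetical values chosen for illustration, not figures from the question.

```python
import math


def standard_error(p, n):
    """Standard error of a proportion p estimated from n independent plays."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical values, not taken from the question.
WIN_PROBABILITY = 0.10
PLAYS_PER_DAY = 1000

se_day = standard_error(WIN_PROBABILITY, PLAYS_PER_DAY)
se_month = standard_error(WIN_PROBABILITY, 30 * PLAYS_PER_DAY)

print(f"one day:  +/- {se_day:.4f}")
print(f"30 days:  +/- {se_month:.4f}")
print(f"ratio:    {se_day / se_month:.2f}")  # the ratio is sqrt(30)
```

Under these assumptions the chance spread in a casino's observed winning rate is about 5.5 times narrower after a month than after a day, so a day-to-day difference caused by sampling error should mostly disappear. A difference caused by bias would not.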

6. (4pts) The company running the Lotto advertises winners to convey the impression that anyone can be a winner too. However, only winners are shown in these ads, even though winners comprise a tiny fraction of all players. Thus the incidence of winners shown in the ads is consistently much higher than the real incidence of winners. Assuming that you can treat this difference as a type of error, what type of error is this? One answer only. Bias is indicated by the ‘consistently much higher than the real incidence.’ You can also infer bias from the goal ‘to convey the impression that anyone can be a winner too.’

A) Sampling B) Bias C) RPA D) Human and technical E) None

7. (4pts) You are asked to guess the number of pennies in a large jar for a prize, the only stipulation being that you cannot directly count them in any fashion. Using a scale whose accuracy has been confirmed with standard weights, you weigh the entire bulk of pennies at 4236.74 grams and you weigh a single penny at 2.35 grams (your scale only goes to 2 decimal places). Dividing, you arrive at 1802.87 pennies in the jar, and you round it up to 1803. However, the actual number of pennies is 1800, and you don’t win the prize. What type of error is responsible for your miscount? One answer only. RPA error is indicated by the use of a small number of decimal places to divide a small number into a large number and get a result that is only approximate.

A) Sampling B) Bias C) RPA D) Human and technical E) None
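
The RPA explanation in question 7 can be checked with a little arithmetic. The sketch below assumes that a two-decimal display means the true single-penny weight lies within half a display step (0.005 g) of the reading; that assumption is ours, not stated in the question. The much smaller relative uncertainty in the total weight is ignored.

```python
TOTAL_WEIGHT = 4236.74  # grams, from the question
PENNY_READING = 2.35    # grams, as displayed by the scale
HALF_STEP = 0.005       # assumed rounding uncertainty of a two-decimal display

# Point estimate, plus the counts implied by the heaviest and lightest
# penny weights that would still display as 2.35.
estimate = TOTAL_WEIGHT / PENNY_READING
low = TOTAL_WEIGHT / (PENNY_READING + HALF_STEP)
high = TOTAL_WEIGHT / (PENNY_READING - HALF_STEP)

print(f"estimate: {estimate:.2f} pennies")
print(f"plausible range: {low:.1f} to {high:.1f} pennies")
```

Under that assumption the plausible range spans several pennies and includes the true count of 1800, so the limited precision of the single-penny weight is enough to account for the miss.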

8. (4pts) Which in the following list constitute(s) human and technical error? MTF

A) sample mixup Yes, as the only possible source of this problem is humans

B) lab protocols fail to work as claimed Yes, since the problem lies at a technical level

C) outliers of natural variation No, this was given as a form of sampling error

D) failure to follow protocol Yes, as the only possible source of this problem is humans

E) deliberate contamination of a sample No, this is a form of bias

Ideal Data (Fixes and error)

9. (4 pts) Which of the following options correctly describe how a classroom demonstration was used to illustrate properties of error? MTF

A) coin flip to illustrate sampling error and how to reduce sampling error Yes for both. How to reduce sampling error followed from the pooling of coin flips for the entire class to get a much narrower distribution of P(heads)

B) “choose a random odd number” to illustrate unintentional bias Yes, since the distribution of numbers was far from random (7 was chosen way too often, etc.)

C) width of a dime to illustrate human and technical error No this was RPA error because the device only showed 2 decimal places

D) memory test to illustrate accuracy (as a type of RPA error) No, there was no such ‘memory test’

(10-11). For each of the following statements, mark the appropriate letters that describe the data design features present. Mark a data feature only if it is explicitly present at some level in the problem description.

10. (4pts). You decide to test whether sober people can routinely pass the SFST, and whether age affects performance. You recruit 200 people of different ages and inform them only that they will be given the SFST, they must be sober at the time (verified with a breathalyzer test that is calibrated against a blank), and that you are interested in whether men are better than women at passing the test; they are not told about your interest in the effect of age. They are asked to show up in alphabetical order on the same day. The test is administered by officers in uniform who are certified to administer the test and who follow formal test procedures; the actual trials are videotaped and verified by others who are also certified. MTF

The problem includes a fairly elaborate description of protocol (A); ‘200’ indicates replication (B); ‘against a blank’ implies standards (C); there is no mention of random (not D); ‘not told’ implies blind (E)

(A) explicit protocol	(C) standards	(E) blind
(B) replication	(D) random	(F) none

11. (4 pts) A police group decides to determine how gullible the public is to scams. Hundreds of people are sent invitations to seminars about how to obtain government subsidies (“free money”) for home improvement; of course, no mention is made in this invitation about the real purpose. Each person is invited to only one seminar, but different people are invited to different seminars, so that the police group can try different scamming methods and see which are the most effective. Approximately 50% of the invitees attend their seminar, where they are encouraged to provide the seminar organizers with social security numbers, bank account numbers, and other data that a con artist could use. The attendees are then informed of their vulnerability to scams. Which features are indicated? MTF

The problem includes a fairly elaborate description of protocol (A); ‘Hundreds’ indicates replication (B); there is no mention of standards but there is a description of controls (‘invited to different seminars, so that the police can try different scamming methods’) so controls was not graded; there is no mention of random (not D); ‘no mention’ implies blind (E)

(A) explicit protocol	(C) standards	(E) blind
(B) replication	(D) random	(F) none

12. (4pts) MTF A standard (measurement control) to evaluate whether a DNA typing lab is making mistakes could consist of which of the following. Assume that the DNA type (barcode) of the sample is unknown to you unless indicated otherwise. “Coded” means that a number is attached to the sample but without the name of the person whose DNA it is; “labeled” means that the sample is labeled in some fashion which may or may not be coded. Assume that you are the one sending the standards to the lab for testing. You want to know if the results could possibly tell you if a mistake has been made without further testing on your part. A standard in this case is

A) a sample whose DNA type/barcode is known to you in advance Yes, since you know the DNA type independently of the lab’s analysis

B) a coded sample of DNA No, coding makes it blind but does not indicate that you also know the DNA type

C) two samples of the same DNA that you have labeled differently but you know are the same Yes, since the lab’s failure to assign them the same DNA type would indicate lab error

D) any labeled sample of DNA No, labeling the sample does not indicate its type (just as coding the sample in option B doesn’t either)

E) any replicated sample of DNA (two samples from the same person) Yes, for the same reason as in C

13. (4pts) Don polled two groups of people for their opinions on the Iraq war. One group consisted of 200 UT students passing by the Student Union one day; the other group consisted of 100 people in a much larger audience attending a Baptist sermon on Sunday. Attitudes were substantially different between the two groups. Don wants to rule out the possibility of bias instead of sampling error as the cause of the difference. What options describe reasonable ways to decide if his observed differences are due to sampling error rather than bias? MTF

A) Obtain a third sample different from either of the first two groups (from non-students, non-churchgoers). If that third sample matches either of his first two samples, or if it falls between them, then he can be confident that the difference among his first two is sampling error. No, since this third group does not give you any information about either of the first two groups.

B) Obtain much larger samples from the two original groups (a larger sample of UT students, a larger sample of people attending the same church); if the difference persists, he can be confident that the difference is not sampling error. Yes, increasing the sample size is the standard way of reducing sampling error

C) Perform a statistical analysis of the data; a statistical analysis will tell him whether the differences are consistent with sampling error. Yes, this is the standard approach to testing whether sampling error can account for the differences.

D) Repeat the surveys with new people at the same locations but using a different set of questions. If the original difference was due to bias rather than sampling error, the bias should go away when the questions are changed. No, for the same reason as in A – the new groups give you no information about the original groups.
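
As an illustration of option C in question 13, one common statistical analysis for comparing two groups is a two-proportion z-test. The sketch below uses the sample sizes from the question (200 and 100), but the numbers of respondents holding a given opinion (120 and 40) are hypothetical, since the question reports no results.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference between two sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Sample sizes are from the question; the counts 120 and 40 are hypothetical.
z = two_proportion_z(120, 200, 40, 100)
print(f"z = {z:.2f}")
```

With these made-up counts, z is well above 1.96, so sampling error alone would be an unlikely explanation for the difference. The test does not say what else caused the difference; it only addresses whether sampling error is a sufficient explanation.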

14. (4pts) For a technique used to declare a match between a forensic sample and a suspect, such as DNA typing, fingerprinting, or hair matching, what is the consequence of not having a reference database from the population? MTF

A) Without a reference database, it is not possible to conduct proficiency tests of lab error rates. No, proficiency tests tell you if the lab is correct in its assays; they do not require a reference database

B) Without a reference database, it is not possible to calculate a RMP (random match probability) Yes, a reference database gives you the frequency of different characteristics in the population, hence the RMP

C) Without a reference database, it is not possible to detect sample mixup No, sample mixup has nothing to do with reference databases

D) Without a reference database, there is no benefit of blind procedures. No, for similar reasons as in (A). Blind reduces bias, which has nothing to do with ref. databases

15. (4pts) Which options identify a “fix” for the type of error indicated; a “fix” may either reduce that error or at least allow you to detect/measure that error. MTF

A) error: sample mixup. Fix: code tubes blindly No, this won’t help detect or fix mislabeling and other sources of sample mixups

B) error: unintentional failure to follow protocol because it is difficult to understand. Fix: design a protocol that is easier to understand but achieves the same objectives Yes, like you really need this course to answer this one.

C) error: lab fails to conduct analyses carefully and fails to check results because they know which samples belong to the suspect and know what results are consistent with suspect being guilty. Fix: code samples so that lab does not know which belong to suspect. Yes, this fix is a form of blind, which is one of the fixes for bias. The error is an obvious description of bias

D) error: lab occasionally declares false matches, but they often go undetected. Fix: blind proficiency tests to measure the error rate. Yes, this doesn’t fix the problem but measures the extent of it.

(16, 17). Do-it-yourself protocol. You are conducting an external review/test of a genotyping lab. Your job is to send two tubes to the lab, with labels. There are several options for the content of and label on a tube. You must decide which contents to send and how to label the tubes so that the features of ideal data requested in the question are present from the lab's perspective. If a tube has a person's name on it, the lab can assume that the tube contents belong to the person named on the label. If a tube is labeled with a number, the contents are unknown to the lab but known to you. New: parentheses ( ) around a blood type, marker or gender indicate that you do not know the individual’s status for that characteristic. Your options for tube contents and tube labels are:

option	tube label	Contents in the tube are from	Blood type	Gender	Marker status
(A)	Laura Baker	Laura Baker	B	Female	+
(B)	Darin Rokyta	Darin Rokyta	(AB)	(Male)	(negative)
(C)	Rachael Springman	Rachael Springman	O	Female	+
(D)	#132	Darin Rokyta	(AB)	(Male)	(negative)
(E)	#218	Patsy Cline	(A)	Female	(+)
(F)	#10	Pam Hines	O	Female	negative
(G)	Jerry Allison	Jerry Allison	B	Male	negative
(H)	#101	Brent Iverson	AB	Male	negative
(I)	No combination of tubes can satisfy the protocol

In the following questions, choose two letters among options (A)-(H) to describe the two tubes that will be sent to the lab. The tube labels are the only information the lab receives about the samples, and the lab does not have prior information about the individuals. If it is possible to satisfy the protocol, the question will require exactly two letters and only two letters -- one for each tube. Thus, the answer for a question might be (A) & (B), or it might be (D) & (F). If more than one pair of options are possible correct answers, fill in only one correct pair of options. Thus, if (A) & (B) is one acceptable answer, and (C) & (D) is another acceptable answer, fill in either (A)&(B) or (C)&(D), but not both. If a factor (such as identity, blood type, gender, etc.) is not specified in the protocol, then that factor will be ignored in grading the answer.

Alternatively, if a protocol cannot be satisfied with two from (A)-(H), fill in (I).

16. (3 pts) Choose two tubes to achieve replication of gender but nothing else. You should know both that gender is replicated and that nothing else is replicated, and the replication should be blind to the lab (you can assume the lab will know gender from the name on the tube).

two tubes or I: (A) (B) (C) (D) (E) (F) (G) (H) (I)

You basically go down the list, looking for two samples that are female or two that are male. Then look to see that their blood type and marker status are both different. None of these 3 characteristics can be in parens (), since the problem states you should know that gender is replicated but nothing else is. You are left with A&F

17. (3 pts) Make the tubes replicated for marker, gender and blood type, but all replication is blind to the lab. You should know that the replication is present, even if you don’t know the marker, gender, and blood type.

two tubes or I: (A) (B) (C) (D) (E) (F) (G) (H) (I)

Making the replication blind to the lab means that at least one of the samples must be coded, since the problem states that they can infer gender from the name. Two different people whose blood type, gender, and marker status are known to be all the same (hence no parens) could potentially satisfy this, but the problem states that you should know that the replication is present even if you don’t know what the blood types (etc.) are. Thus, if the samples are from the same person, you know everything is replicated. B&D are the only two that work.
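
The "go down the list" search used for questions 16 and 17 can be written as a short check over all pairs of tubes. This is a sketch of our reading of the rules, not an official grading procedure. Values in parentheses in the table are stored as None (unknown to you), "blind to the lab" is taken to mean that at least one tube carries a number rather than a name, and question 17 is simplified to "both tubes come from the same person."

```python
from itertools import combinations

# (option, label, contents, blood type, gender, marker)
# None marks a value shown in parentheses in the table (unknown to you).
TUBES = [
    ("A", "Laura Baker", "Laura Baker", "B", "Female", "+"),
    ("B", "Darin Rokyta", "Darin Rokyta", None, None, None),
    ("C", "Rachael Springman", "Rachael Springman", "O", "Female", "+"),
    ("D", "#132", "Darin Rokyta", None, None, None),
    ("E", "#218", "Patsy Cline", None, "Female", None),
    ("F", "#10", "Pam Hines", "O", "Female", "negative"),
    ("G", "Jerry Allison", "Jerry Allison", "B", "Male", "negative"),
    ("H", "#101", "Brent Iverson", "AB", "Male", "negative"),
]


def is_coded(tube):
    """A tube labeled with a number hides the donor from the lab."""
    return tube[1].startswith("#")


def known_same(a, b):
    return a is not None and b is not None and a == b


def known_different(a, b):
    return a is not None and b is not None and a != b


def fits_q16(t1, t2):
    """Gender known to match; blood type and marker known to differ; blind."""
    blind = is_coded(t1) or is_coded(t2)
    return (blind
            and known_same(t1[4], t2[4])
            and known_different(t1[3], t2[3])
            and known_different(t1[5], t2[5]))


def fits_q17(t1, t2):
    """Same person in both tubes, so everything is replicated; blind."""
    blind = is_coded(t1) or is_coded(t2)
    return blind and t1[2] == t2[2]


q16 = [(a[0], b[0]) for a, b in combinations(TUBES, 2) if fits_q16(a, b)]
q17 = [(a[0], b[0]) for a, b in combinations(TUBES, 2) if fits_q17(a, b)]
print("Question 16:", q16)
print("Question 17:", q17)
```

Under these assumptions the search returns only A & F for question 16 and only B & D for question 17, matching the answers given above.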

18 (5 pts). The following pair of graphs was shown in relation to the coin flip demo in class. Which points were illustrated by either or both graphs? The horizontal axis is the proportion heads, and both horizontal axes span 0 to 1. MTF

(A) There is greater bias in the left graph, because the left shows that more people failed to get the right proportion of heads. No. There is no bias in the left graph. You could not argue a bias without specifying what the true value is. In class we suggested that 50% heads was expected, but there is no evidence of a systematic deviation from that in either graph.

(B) Classes from different years have generated different distributions of the proportion of heads Graded either way, because the right graph supports this point; had the option specified the left graph (10 flips per observation), it would have been wrong

(D) The right graph has the least RPA error. No RPA error in this demo.

(E) With 10 flips, most of the class failed to get within 5% of the expected frequency (50% heads) Yes, as mentioned in class and can be seen in the graph

(F) Pooling the data from the entire class (and previous classes) consistently yielded results close to 50% heads (within 5% or so) Yes, from the right graph

(G) Bias and sampling error can affect the same data No, bias was not part of this demo
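
Options (E) and (F) can be checked against the binomial distribution for a fair coin. The sketch below computes the chance that the proportion of heads falls within 5 percentage points of 50%. The pooled sample size of 1000 flips is a hypothetical stand-in; the actual number of flips pooled in class is not given here.

```python
from math import comb


def prob_within_5_percent(n):
    """P(proportion of heads is within 0.05 of 0.5) for n fair coin flips."""
    # |k/n - 0.5| <= 0.05 is the same as 10 * |2k - n| <= n (integer math
    # avoids floating-point trouble at the boundary).
    favorable = sum(comb(n, k) for k in range(n + 1) if 10 * abs(2 * k - n) <= n)
    return favorable / 2 ** n


p_ten = prob_within_5_percent(10)       # 10 flips per student
p_pooled = prob_within_5_percent(1000)  # hypothetical pooled total

print(f"10 flips:   {p_ten:.3f}")
print(f"1000 flips: {p_pooled:.4f}")
```

With 10 flips only exactly 5 heads qualifies, which happens about a quarter of the time, so most students miss the 5% window by chance alone. With a large pooled sample nearly every outcome lands inside it.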

Drug Testing, DWI testing

19. ( 4 pts). What constitutes a standard in a drug test for evaluating lab error rates? (MTF)

A) A sample with a known level of drug present. Obviously yes

B) A sample known to be drug-free. Yes, since a test result showing a positive level would be known to be in error

C) A written procedure describing the level of performance to be upheld by the lab No, this option is an attempt to see if someone thought a standard was the same as ‘standards to be upheld’

D) Any measure taken by the lab to detect or reduce human and technical error No, since changing the protocol could be a measure to reduce human and technical error that did not involve standards.

E) A proficiency test given to the lab that does the analysis, regardless of whether the test is blind. Yes, proficiency tests were given as a form of standards (technically, they are tests using standards, but we won’t split hairs).

20. (4pts) Which protocol features are not needed or important when drug testing (e.g., to test for the presence of cocaine) but are needed or important for DNA typing and determining the significance of a match? Options are correct if they identify something that is specifically useful only for DNA type matching. MTF This question aims to get you thinking about ideal data and where different features apply. Ideal data principles apply to virtually all procedures (yes, blind is only relevant sometimes, etc.). But one big difference between DNA and drug testing is hinted at in the question: the significance of a match. For this, DNA typing needs a reference database and a method for calculating the RMP. Drug testing does not. Thus options D and F are the only ones specific to DNA.

A) replication

B) a knowledge of lab error rates

C) standards in the form of a sample of known properties

D) methods to calculate a RMP

E) blind processing of samples

F) a reference database from the human population

21. (4pts) What described in class or book constitutes a form of replication in DWI testing? To be correct, the option must both describe replication and be something that is actually done or used. MTF

A) multiple air blanks in the breathalyzer test Yes, as we showed, multiple blanks are used (and obviously constitute a form of replication of the testing process).

B) multiple breath samples from the suspect Yes, for similar reasons as in (A)

C) an air blank plus the breath sample from the suspect No, both measurements are taken but they do not constitute replication

D) multiple tests used to assess SFST performance Yes, multiple tests are used and constitute replication of performance

E) a sample of known alcohol content tested by the breathalyzer Hmmm. This sounds like a standard, not replication.

F) an explicit protocol (formal procedure) for giving SFST instructions to the suspect Another give-away that should be left blank.

DNA Typing plus Criminal Justice System

22 (4 pts). The goal is to determine lab human and technical error rates of DNA typing through replicated typing of the same individuals. Ten tubes whose DNA type is unknown are sent to the lab. These 10 samples are from 8 people, but the lab does not know the number of people. Different tubes from the same person have different codes. For this goal, is the replication blind to the lab performing the typing? Why or why not? MTF

Blind means that the person being assayed or the person performing the assay is unaware of some critical feature of the design, relative to what they are doing or is being done to them. The goal is to determine error rates of the lab, so to be blind, the lab could be unaware they were being tested or be unaware of the sample identities (i.e., which samples are the same or what the DNA type is for the samples they are testing). The fact that they can use their results to figure out which samples are the same is irrelevant (option B), since if they make a mistake they won’t be able to figure it out. Although it may be true that the lab knows some details, the test is blind as long as the lab cannot figure out its mistakes before sending off the results, so (C) is wrong. (D) is just out to lunch. So the only correct option is (A).

A) It is blind, because the lab does not know which tubes belong to the same person.

B) It is not blind regardless of the labels, because the lab can figure out which samples are the same after they do the DNA typing.

C) It is ambiguous as to whether the procedure is blind, because there are ways in which the procedure is blind and other ways in which it is not.

D) It is not blind because there is no standard sent with the tubes.

23-25 (3 pts each). Fill in the blanks of the “consequences” column of the table with the best option(s) from the list below the table. The question number is given in each blank.

Deviation from Ideal Data	Consequences
lack of replication	23. one answer only
samples not processed blindly	24. MTF
Inadequate protocols for analysis of results	25. one answer only

Your choices for consequences are:

(A) Improper calculation of RMP (random match probability) in some cases This one pretty obviously ties to 25, which addresses ‘protocols for analysis of results’

(B) Sample mix-ups in a case can go undetected With replicated testing of the same samples, mix-ups could be detected. So this is one possible consequence of lack of replication, hence the choice for #23.

(C) The RMP threshold for conviction will appear to be exceeded when it is actually not exceeded There is no ‘threshold for conviction’ so this option is not to be used.

(D) Allows deliberate contamination of sample An obvious possibility when samples are not blind. (should be chosen for #24)

(E) This protocol is no longer applicable because of recent changes in DNA typing methods This option doesn’t apply to the question, which is asking for consequences of deviations from ideal data.

(F) The protocol allows for a biased willingness to accept results The ‘biased’ points toward lack of blind (should be chosen for #24)

(G) Selective reinforcement of the prime suspect Selective reinforcement was described as a process stemming from the lack of blind in prosecution protocols. It thus applies to 24, but perhaps could have been worded more directly toward the question.

(H) The protocol increases the likelihood of sample mix-up Lack of blind hopefully does not change the likelihood of sloppy procedure.

(I) The full extent of lab error rates remains unknown Knowing error rates has to do with proficiency tests, not whether samples are processed blindly

(J) This protocol allows outliers of natural variation to escape detection this is a failure to detect sample extremes, which has nothing to do with blind.

26. (4pts) If the lab declares a match between a suspect and a crime scene sample, and there is a very small probability of a random match (e.g., 1 in a billion) but a much larger probability that the lab made a mistake and falsely declared a match (e.g., 1%), what can be said about the odds that the suspect does not match the sample? one answer

We went over this in class. We want to know the chance that a declared match means something other than the sample came from the suspect. One way this can happen is a random match. Another way is that the match has been wrongly declared (lab error). Since they are different ways to get a match that isn’t real, we have to combine them. The chance that the match is not real is ALWAYS at least as large as the larger of these two values (lab error rate and RMP). When they are both small, you should approximately add them together to get the combined significance.

A) The odds are still close to 1/billion because there is a 99% chance that the lab did NOT make a mistake, so 0.99x1/billion is still close to 1/billion.

B) The odds are close to 1%, because 1% of the time the suspect will not match even though the lab indicates a match, and 1/billion times the match will be because of a random match. 1% + 1/billion is close to 1%.

C) The odds are 1/billion divided by 1%, hence 1/(10 million)

D) The odds cannot be calculated when two types of error are present.
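
The combination rule described for question 26 can be verified numerically. The sketch below assumes that lab error and a random match are independent ways of getting a false match; the question does not state this explicitly.

```python
RMP = 1e-9        # random match probability, from the question
LAB_ERROR = 0.01  # chance the lab falsely declares a match, from the question

# Chance that at least one of the two sources of a false match occurs,
# assuming they are independent.
exact = 1 - (1 - RMP) * (1 - LAB_ERROR)

# The "approximately add them" shortcut from the explanation above.
approx = RMP + LAB_ERROR

print(f"exact:  {exact:.12f}")
print(f"approx: {approx:.12f}")
```

Both values are about 1%, and the exact value is at least as large as the larger of the two error rates, consistent with option B.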

27. (4pts) Considering the last two decades, which forensic methods have been shown (in proficiency tests) to have error rates sometimes exceeding 10% or to otherwise be unreliable? MTF

These were gone over in class; all but DNA typing have had high error rates. The fingerprint studies were done in the 1990s, and it is possible that they are better now, but the problem is about methods that ‘have been shown to have high error rates,’ so fingerprints still apply.

A) polygraph (lie detector)	D) hair matching (not DNA based)
B) fingerprint matching	E) eyewitness identification
C) DNA typing

28. (4 pts.) Exam Key Code: Fill in (AB) on question 28 to indicate your exam code.

Also, fill in the correct bubbles for your name and pad number on the scantron form.

You must turn in this hard copy (with your name on it) and your scantron to receive credit for this exam.
