Opinion
This murder case presents a narrow, but important, question regarding the admissibility of deoxyribonucleic acid (DNA) evidence to prove identity in criminal prosecutions. A DNA comparison of blood found at the crime scene with defendant’s blood resulted in a match. That is, defendant’s genetic profile matched that of the blood at the crime scene so that he could not be excluded as a donor of that blood. Similarly, a DNA comparison of blood found on the pants defendant was wearing when arrested resulted in a match with the victim’s blood so that the victim could not be excluded as a donor of that blood. Obviously, evidence tending to show that defendant’s blood was found at the crime scene, and that the victim’s blood was on defendant’s pants, would be highly probative to whether defendant was the killer.
When a match is found, the next question is the statistical significance of the match. Of course, a match is less significant if the blood could have come from many persons rather than from only a few. Experts calculate the odds or percentages—usually stated as one in some number—that a random person from the relevant population would have a similar match. The question here revolves around exactly what is the relevant population. The question is complicated by the fact that the odds vary with different racial and ethnic groups. Because of this variation, separate databases are maintained for different population groups, and the odds for each group are calculated *1240 separately. In this case, as in many cases, no evidence exists of the racial or ethnic identity of the perpetrator other than evidence indicating that defendant was the perpetrator. Over defense objection, the trial court permitted the prosecution to present evidence of the odds as to the three most common population groups in this country—Caucasians, African-Americans, and Hispanics. For example, the evidence showed that only one Caucasian in 96 billion would match the crime scene blood that matched defendant’s profile.
Defendant contends the court erred. Relying heavily on the opinions in People v. Pizarro (1992)
I. Facts and Procedural History
Around 6:15 p.m. on April 6, 2000, the body of 13-year-old Sarah Phillips was found on the living room floor of her Vacaville home. She had been strangled with a telephone cord, and her body had suffered multiple bruises, scrapes, and scratches. Her pants and panties had been removed, and her shirt was pushed up.
Defendant was arrested around 2:00 a.m. on the morning after the killing and charged with her murder. He had visited the victim’s house regularly while dating her older sister three years earlier. DNA evidence as well as other evidence implicated him as the perpetrator. The Court of Appeal summarized the non-DNA evidence: “[Defendant] aggressively propositioned several women before the assault on Sarah, showing interest in whether they lived alone; he admitted speaking with Sarah around the time of the killing when she was alone at her home, where the killing occurred; he was seen by witnesses in the area before the killing, without scratches, and after the killing, with scratches consistent with the struggle indicated by the crime scene evidence; and shortly after the murder he told a witness he had done something bad, which he could not ‘fix.’ ”
*1241 The prosecution also presented DNA evidence. Three kinds of DNA tests (D1S80, DQA1 polymarker, and STR) were performed on bloodstains found on the victim’s clothing and on defendant’s clothing when he was arrested. All of the tests matched defendant’s genetic profile to blood on the victim’s jeans, and the victim’s profile to blood on defendant’s pants. The STR testing also matched the victim to a hair found in defendant’s pants, and both the victim and defendant to blood found under the victim’s fingernail.
The STR test was the most sensitive. It compared nine genetic markers and included a marker for gender discrimination. Nicola Shea, a criminalist with the Sacramento laboratory of the California Department of Justice (Department), was the prosecution’s STR expert. She testified that, to help juries understand the significance of a DNA match, the Department followed the statistical approach recommended by a 1996 report of the National Research Council for presenting the frequency with which genetic profiles occur. (National Research Council, The Evaluation of Forensic DNA Evidence (1996) (hereafter 1996 NRC Report).) The Department used databases that the Federal Bureau of Investigation published in the Journal of Forensic Sciences reflecting profile frequencies in the Caucasian, Hispanic, and African-American populations, “because those are the major populations in our country and in our state.”
Shea testified she used all three databases to avoid making assumptions about the ethnic background of the perpetrator. Data for other groups, such as Native Americans, would also be compared if information had indicated another group might be a source of the evidence sample—for example, if the crime had occurred on an Indian reservation. She explained that “the same profile will show up with a different frequency in the different populations.” However, she also said that “the three populations given give you a ballpark of how often you would expect to see that profile in those populations. If something is extremely rare in those three populations, you might expect it for that many markers to be extremely rare in one of the other populations.” When nine genetic markers are used in the analysis, the result would be a “pretty discriminating number” no matter what population database was used.
Defendant’s genetic profile would be expected to occur in one of 96 billion Caucasians, one of 180 billion Hispanics, and one of 340 billion African-Americans. The victim’s genetic profile would be expected to occur in one of 110 trillion Hispanics, one of 140 trillion Caucasians, and one of 610 trillion African-Americans. Criminalist Shea noted that these profiles were extremely rare; the world contains only about six and a half to seven billion human beings.
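The scale of these figures can be illustrated with simple arithmetic. The following is a minimal sketch, not part of the record: it takes the three frequencies reported for defendant's profile and computes how many matching individuals one would expect in a population of seven billion (the upper end of the criminalist's estimate). The function name `expected_matches` is hypothetical, and applying a single group's frequency to the entire world population is a simplifying assumption made only to show the order of magnitude.

```python
from fractions import Fraction

# Frequencies reported for defendant's profile, expressed as "one in N".
DEFENDANT_PROFILE = {
    "Caucasian": 96 * 10**9,
    "Hispanic": 180 * 10**9,
    "African-American": 340 * 10**9,
}

# Assumption: upper end of the "six and a half to seven billion" estimate.
WORLD_POPULATION = 7 * 10**9


def expected_matches(one_in_n: int, population: int) -> Fraction:
    """Expected number of people in `population` who share a profile
    that occurs with frequency 1/one_in_n (hypothetical helper)."""
    return Fraction(population, one_in_n)


for group, one_in_n in DEFENDANT_PROFILE.items():
    value = expected_matches(one_in_n, WORLD_POPULATION)
    print(f"{group}: about {float(value):.3f} expected matches")
```

Under each database the expected number of matching persons is well below one, which is the sense in which the profiles were described as extremely rare.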
*1242 Defendant objected to the introduction of these profile frequencies, arguing that the prosecution had failed to lay a foundation for this evidence because it did not establish the race of the persons who left the blood samples. The trial court disagreed and admitted the evidence. The jury found defendant guilty of first degree murder with use of a dangerous weapon during the commission of an attempted rape and a lewd act on a child.
Defendant appealed. The Court of Appeal found that the trial court properly admitted the DNA evidence and affirmed the judgment. We granted review to resolve a conflict between this opinion and Pizarro II, supra,
II. Discussion
“DNA analysis ... is a process by which characteristics of a suspect’s genetic structure are identified, are compared with samples taken from a crime scene, and, if there is a match, are subjected to statistical analysis to determine the frequency with which they occur in the general population.” (People v. Barney (1992)
As the Court of Appeal in this case explained, “Profile frequencies within the major racial groups in the United States (Caucasian, African-American, Hispanic, East Asian, and Native American) vary to such an extent that separate DNA databases are maintained for the purpose of providing accurate estimates of profile frequency. (1996 NRC Rep., pp. 28, 57-58, 98, 151; see also People v. Soto, supra,
The trial court permitted the prosecution to present evidence of profile frequency within the three most common populations in this state and country—Caucasian, Hispanic, and African-American. Defendant contends that because no independent evidence exists that the donor of the blood samples at the crime scene and on his pants belonged to any particular population group, the frequency of these groups is irrelevant. He relies primarily on the two Pizarro cases. (Pizarro I, supra,
*1243
The Pizarro defendant was half Hispanic and half Caucasian. Because of this, the prosecution expert calculated the odds regarding the Hispanic and Caucasian population groups. (Pizarro I, supra,
In part, the Pizarro opinions condemned presenting evidence of the odds solely regarding the defendant’s population group. (See Pizarro I, supra, 10 Cal.App.4th at pp. 93-94; Pizarro II, supra, at pp. 629-631 & fn. 79.) On this point, the court was on solid ground. As one recent commentator has explained, “One strangely persistent fallacy in the interpretation of DNA evidence is that the relevant ethnic or racial population in which to estimate a DNA profile frequency necessarily is that of the defendant. The issue has been cogently analyzed, and it should be clear that the relevant population is the entire class of plausible perpetrators.” (Kaye, Logical Relevance: Problems with the Reference Population and DNA Mixtures in People v. Pizarro (2004) 3 Law, Probability & Risk 211, 211 (hereafter Kaye 1); see also the authority collected in Pizarro II, supra, at pp. 629-630, fn. 79.) Accordingly, we agree with the Pizarro opinions that a trial court should not admit evidence of the odds solely regarding the defendant’s population group. Similarly, when the match involves the victim, the court should not admit evidence of the odds solely regarding the victim’s population group.
As the Court of Appeal explained, this case does not present this problem: “[T]here is no suggestion in the record that the databases used by the DNA expert for calculating profile frequencies were chosen based on Wilson’s or Sarah’s racial background. Witnesses described Wilson as a light-skinned Black man. The police report described Sarah as White. Shea testified that she followed standard practice of determining the frequency of the matched profiles using Caucasian, Hispanic, and African-American databases, in order to avoid making assumptions about the ethnic background of the perpetrator *1244 or the victim. (Shea misspoke in reference to the ‘victim,’ whose ethnic identity was known; it was the ethnicity of the person who left the bloodstains and the hair on Wilson’s pants, which matched Sarah’s profile, that was in question.)”
But the court in Pizarro II went on to say that, absent independent evidence of the population group to which the perpetrator belonged, any evidence regarding any particular group was irrelevant and hence inadmissible. “[I]n the absence of sufficient evidence of the perpetrator’s ethnicity, any particular ethnic frequency is irrelevant. The problem is . . . one of preliminary fact .... It does not matter how many Hispanics, Caucasians, Blacks, or Native Americans resemble the perpetrator if the perpetrator is actually Asian. If various ethnic frequencies are presented to the jury, each will have been admitted without adequate foundation.” (Pizarro II, supra,
This latter conclusion is very significant. Professor Kaye summarized its potential impact. “[T]he Pizarro court announced that giving a range of frequencies for the major racial or ethnic groups in the United States is unacceptable. Since providing statistics from several racial groups is the standard way of assessing the significance of a match in cases in which the racial and ethnic status of the perpetrator of the crime initially is unknown, the opinion casts doubt on the outcomes of innumerable cases.” (Kaye, supra, 3 Law, Probability & Risk at p. 214.)
The Court of Appeal here disagreed with this portion of Pizarro II. “We believe the Pizarro court’s insistence that the database used to calculate the profile frequency must be drawn from the perpetrator’s racial group was misplaced. The random-match probability is meant to measure the rarity of the genetic profile detected in the evidence sample and in the defendant by estimating the frequency with which it occurs in the population of possible suspects. As explained in the 1996 NRC Report:
“ ‘Suppose that a DNA sample from a crime scene and one from a suspect are compared, and the two profiles match at every locus tested. Either the suspect left the DNA or someone else did. We want to evaluate the probability of finding this profile in the “someone else” case. That person is assumed *1245 to be a random member of the population of possible suspects. So we calculate the frequency of the profile in the most relevant population or populations. The frequency can be called the random-match probability, and it can be regarded as an estimate of the answer to the question: What is the probability that a person other than the suspect, randomly selected from the population, will have this profile? The smaller that probability, the greater the likelihood that the two DNA samples came from the same person.’ (1996 NRC Rep., p. 127, italics added.)
“The population of possible suspects frequently includes a range of ‘potential perpetrators,’ whose numbers and race depend on what is known about the circumstances of the crime. When the perpetrator’s race is unknown, the frequencies with which the matched profile occurs in various racial groups to which the perpetrator might belong are relevant for the purpose of ascertaining the rarity of the profile.”
We agree with the Court of Appeal. Relevant evidence is evidence “having any tendency in reason to prove or disprove any disputed fact that is of consequence to the determination of the action.” (Evid. Code, § 210.) “ ‘The test of relevance is whether the evidence tends, “logically, naturally, and by reasonable inference” to establish material facts such as identity, intent, or motive.’ ” (People v. Harris (2005)
*1246
This precise issue was not before us in People v. Soto, supra,
We also disagree with the Pizarro court’s conclusion that if the proponent of the evidence presents evidence sufficient for the trier of fact to find, by a preponderance of the evidence, that the perpetrator belonged to a particular population group, probability evidence as to that group, but no other, is relevant and admissible. Excluding probability evidence about any but the most likely group could deprive the jury of potentially crucial evidence. If, for example, the jury believed it 51 percent likely the perpetrator was Caucasian, providing it with the probability only for Caucasians would leave it uninformed regarding the 49 percent possibility the perpetrator was of some other population group. Professor Kaye explains: “[W]e can never know the perpetrator’s ancestry to a certainty. Suppose that the victim had lived and identified the man who assaulted her as ‘Asian.’ The cross-racial identification, made under great stress, could be mistaken. If an Asian defendant wished to argue that the actual rapist might have been Hispanic, it would assist the jury to know whether the DNA types are extremely rare among Hispanics. Indeed, in the limit, if no Hispanic had the requisite genotype, the Hispanic data would refute defendant’s alternative hypothesis. Conversely, if every Hispanic had the incriminating genotype and hardly any Asians did, the DNA evidence would be much less probative of the Asian defendant’s guilt than the Asian-only figure would suggest. To withhold the Hispanic data on the ground that it is irrelevant because the other evidence in the case points to an Asian could be a serious mistake.” (Kaye, supra, 3 Law, Probability & Risk at p. 215.)
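The 51 percent illustration can be put in numbers. The sketch below is a hedged example under assumed figures: it weights each group's profile frequency by the probability that the perpetrator belongs to that group (the law of total probability) and compares the result with the figure for the most likely group alone. Only the 51/49 split comes from the illustration above; the group labels, the two frequencies, and the function name `coincidental_match_probability` are hypothetical.

```python
def coincidental_match_probability(group_weights, group_frequencies):
    """Sum over groups of P(perpetrator in group) * P(profile | group).

    Hypothetical helper; both arguments are dicts keyed by group label.
    """
    if abs(sum(group_weights.values()) - 1.0) > 1e-9:
        raise ValueError("group weights must sum to 1")
    return sum(
        weight * group_frequencies[group]
        for group, weight in group_weights.items()
    )


# 51/49 split taken from the illustration; frequencies are invented.
weights = {"group A": 0.51, "group B": 0.49}
frequencies = {"group A": 1e-9, "group B": 1e-3}

overall = coincidental_match_probability(weights, frequencies)
most_likely_only = frequencies["group A"]

print(f"most likely group only: {most_likely_only:.2e}")
print(f"weighted over both groups: {overall:.2e}")
```

With these invented numbers the weighted figure is several hundred thousand times larger than the figure for the most likely group alone, which is why a jury given only the latter would be left uninformed about the remaining 49 percent.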
If a defendant wanted to argue that the perpetrator might have been a member of a particular population group for which the odds were more favorable to the defense, surely it would be relevant and permissible to admit evidence of those odds. Similarly, the prosecution should be permitted to present evidence of a representative range of groups.
*1247 “Moreover, even if we could know that the perpetrator was, let us say, Asian, with sufficient certainty to exclude all other possibilities from rational discourse, the [Pizarro] court’s logic might lead to the paradoxical conclusion that the frequency data for Asians also is irrelevant. Asian-Americans, after all, are not a homogeneous group. There are many subgroups—Chinese, Indonesian, Japanese, and Korean, to name a few—and each subgroup can be parsed still more finely.” (Kaye, supra, 3 Law, Probability & Risk at p. 215.)
The Pizarro court and, in turn, defendant, present three objections to giving the jury the probabilities of a representative range of population groups, none persuasive. First, the court believed that, “in the absence of sufficient evidence of the perpetrator’s ethnicity, any particular ethnic frequency is irrelevant.” (Pizarro II, supra,
Second, the court believed that “the improper mention of ethnicity unfairly and unjustifiably encourages the jurors to focus on ethnicity and race—specifically the ethnicity and race of the defendant, the only suspect before them.” (Pizarro II, supra,
As Professor Kaye observes, “there is no obvious reason to think that telling the jury that the incriminating genotypes occur infrequently in every *1248 major racial or ethnic group will encourage the jurors to assume that the crime must have been committed by a member of the defendant’s racial or ethnic group. In fact, giving a range of statistics could discourage the jury from jumping to an unjustified conclusion.” (Kaye, supra, 3 Law, Probability & Risk at p. 216.)
Third, the Pizarro court believed that if the expert testifies to a range of probabilities, “the jury hears unjustifiably damaging evidence because the various ethnic frequencies create a range extending from the most conservative and beneficial to the defendant to the most rare and damning to the defendant.” (Pizarro II, supra,
We also agree with Professor Kaye that “because more complete—but unattainable—knowledge [i.e., certainty as to the perpetrator’s race or ethnicity] would eliminate some alternatives does not make it wrong to contemplate those alternatives, and there is no inexorable prejudice in providing a range of applicable figures. [¶][¶] ... The limitations in the databases can be made clear to the jurors, who are in a better position to evaluate the possibility that the match is coincidental if they have a range of the possible frequencies that are consistent with the facts of the case. As such, the exclusionary rule announced in Pizarro is baseless and counterproductive.” (Kaye, supra, 3 Law, Probability & Risk at pp. 216-217, fns. omitted.)
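What a "range" of frequencies looks like in practice can be shown with the figures from this record. The following is a minimal sketch, assuming the three frequencies reported for the victim's profile; the helper name `frequency_range` is hypothetical. It identifies the most conservative figure (the group in which the profile is most common, and so most favorable to the defense) and the rarest figure.

```python
# Frequencies reported for the victim's profile, expressed as "one in N".
VICTIM_PROFILE = {
    "Hispanic": 110 * 10**12,
    "Caucasian": 140 * 10**12,
    "African-American": 610 * 10**12,
}


def frequency_range(one_in_n_by_group):
    """Return (most conservative group, rarest group).

    A smaller N means the profile is more common in that group,
    so the smallest N is the most conservative figure.
    """
    most_conservative = min(one_in_n_by_group, key=one_in_n_by_group.get)
    rarest = max(one_in_n_by_group, key=one_in_n_by_group.get)
    return most_conservative, rarest


conservative, rarest = frequency_range(VICTIM_PROFILE)
print(f"most conservative: one in {VICTIM_PROFILE[conservative]:,} ({conservative})")
print(f"rarest: one in {VICTIM_PROFILE[rarest]:,} ({rarest})")
```

Presenting both ends of the range, rather than a single figure, lets the jurors see how much the estimate depends on the database chosen.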
Moreover, as Justice Pollak noted, “as the science underlying DNA comparisons continues to improve, the practical significance of the different racial frequencies diminishes.” In People v. Barney, supra,
Pizarro II suggested three alternatives to presenting the jury with a range of probabilities: “(1) establish that the perpetrator more likely than not belongs to a particular ethnic population, then present only the frequency in that particular ethnic population; (2) present only the most conservative frequency, without mention of ethnicity; or (3) present the frequency in the general, nonethnic population.” (Pizarro II, supra,
Defendant also argues that even if we reject the Pizarro approach, the evidence here was still improperly admitted because the expert gave the frequency range for only the three most common population groups, rather than all possible groups to which the perpetrator could belong. For example, he argues that “the geographical area near the crime scene in Vacaville included significant numbers of Asians, Pacific Islanders, and Native Americans,” and thus the range of frequencies should at least have included these groups. He notes that the 1996 NRC Report’s recommendation 4.1 states that “if the race [of the person who left the evidence-sample DNA] is not known, calculations for all racial groups to which possible suspects belong should be made.” (1996 NRC Rep., supra, at p. 122, italics added.) We do not believe the National Research Council meant that giving a range of probabilities is impermissible unless the range includes literally all possible population groups. For example, in its comment to recommendation 4.1 in the executive summary, the report suggests that if the race of the perpetrator is unknown, the results could be “given for data on whites, blacks, Hispanics, and east Asians.” (Id. at p. 5.) These four groups would not necessarily include all possible groups.
*1250 Although giving results for all possible population groups would be permissible, doing so is not required to give relevance to the range of possibilities. Furthermore, it is not clear whether it is realistically feasible to include all population groups. In the Soto case, for example, databases existed for separate subgroups of the Asian population group. The expert witness calculated the probabilities only for the Vietnamese subgroup. (People v. Soto, supra, 21 Cal.4th at pp. 531-532 & fn. 25.) It is not clear whether the expert could feasibly have provided information about every Asian subgroup or about Asians as a whole, not to mention all other groups or subgroups in the general population. In this case, Criminalist Shea provided information regarding the three most numerous population groups. This made her testimony relevant and admissible.
Of course, defendant was entitled to cross-examine the witness regarding other possible population groups, as he did in this case. When he did, the witness testified that the frequency of other population groups would be comparably small. Moreover, if defendant believed the perpetrator could have been a member of another population group or groups for which the frequency figures would be more favorable to him, he was entitled to cross-examine the witness or present his own evidence in that regard. The fact that defendants might proceed in either fashion does not make the evidence the prosecution presented irrelevant.
III. Conclusion
We agree with Justice Pollak’s summation: “Thus, there is no cogent reason to preclude testimony of a range of ethnic or racial genetic profile frequencies when the race of the perpetrator is unknown, so long as the data is not presented in a manner that assumes that the race of the perpetrator is the same as the race of the defendant. Since the testimony in the present case made no such assumptions, it was relevant, nonprejudicial, and properly received . . . .” (Fn. omitted.)
Accordingly, we affirm the judgment of the Court of Appeal. We approve Pizarro I, supra,
George, C. J., Kennard, J., Baxter, J., Werdegar, J., Moreno, J., and Haerle, J., * concurred.
Notes
Professor Kaye was a member of the Committee on DNA Forensic Science that helped prepare the 1996 NRC Report. We have treated that report as authoritative. (People v. Soto, supra,
Associate Justice, Court of Appeal, First Appellate District, Division Two, assigned by the Chief Justice pursuant to article VI, section 6 of the California Constitution.
