94 Misc. 2d 704 | N.Y. Sup. Ct. | 1978
OPINION OF THE COURT
The defendants herein are charged with attempted grand larceny in the first degree by extortion and attempted grand larceny in the third degree. During the course of the investí
The People therefore have made a motion for a pretrial ruling by the court that voice identification by spectrographic analysis has reached the standard of scientific acceptance and reliability to warrant its admission into evidence.
The court has held an extensive hearing on this subject at which time highly qualified experts, in the field of spectrography, namely, Dr. Henry Truby and Mr. Frederick Lundgren, testified for the People, and Dr. Louis Gerstman for the defense. Both the People and the defense introduced a number of exhibits into evidence, consisting of documents, treatises, studies and articles dealing with this subject.
Briefly, a sound spectrograph is a device which converts sound waves into electrical signals. These signals are transduced into mechanical energy which operates a stylus. The stylus "burns” a visual pattern onto a paper. This pattern is known as a sound spectrogram.
This spectrogram is commonly called a "voiceprint”.
The visual pattern displayed on the spectrogram is an analogue of the frequency, intensity, and duration of the sound waves introduced into the spectrograph.
The sound spectrograph was originally developed at Bell Telephone Laboratories more than 30 years ago, and was initially used as an aid in studying human speech and the formation of words and syllables.
The machine was soon adapted for use in voice identification and elimination, during which procedure an examiner compares two spectrograms, and makes a judgment based on visual similarities or differences as to whether the voices which produced those spectrograms belong to the same person.
Since speech and voice are determined largely by the size, shape, and relationship of the tongue, lips, and various vocal cavities and articulators, it is assumed that no two voices are exactly alike, as it is highly unlikely that any two persons have identical relationships between these articulators.
It is presumed that, if this technique is proven to be reliable and a generally accepted scientific process, the basic inferences could be made with assurance, and these test results would be admitted into evidence.
STANDARDS OF ADMISSIBILITY
The first question which the court must decide is the nature of the standards which must be applied in the admissibility of this technique.
In People v Leone (25 NY2d 511, 517), in deciding the admissibility of lie detector tests, the court said: "Although perfection in test results is not a prerequisite to the admissibility of evidence obtainable by the use of scientific instruments, the rule has been to grant judicial recognition only after the instrument has been sufficiently established to have gained general acceptance in the particular field to which it belongs.”
In applying this standard to the polygraph, the court said (p 518): "We are all aware of the tremendous weight which such tests would necessarily have in the minds of a jury. Thus, we should be most careful in admitting into evidence the results of such tests unless their reasonable accuracy and general scientific acceptance are clearly recognized.” (Emphasis supplied.)
This court therefore concludes that the standard which must be applied to the admissibility of spectrographic analysis, or indeed of any scientific test, is the twofold test of reliability and general scientific acceptance.
It has been suggested that the test of general acceptance of the scientific community (commonly called the Frye test) as enunciated in Frye v United States (293 F 1013) should be the only test applied. For example, in United States v Brown (No. 34383-72, DC Super Ct, 1973) the court determined that reliability is the basis upon which a new procedure is generally accepted by the scientific community, and that the court
We reject that proposition on several grounds. First, we interpret People v Leone (supra) as mandating the two separate tests. Second, an analysis of cases dealing with the admissibility of spectrographic voice identification reveals that in many instances the courts have dealt separately with the two standards, sometimes basing their decisions on the standard of reliability in derogation of the standard of scientific acceptance.
In United States v Bailer (519 F2d 463, 466) the court said: "The evidence presented in an extensive voir dire demonstrated spectrography’s probative value, despite doubts within the scientific community about its absolute accuracy.”
That court also said (p 465), "The admissibility of spectrographic identification turns primarily on whether this theory has been sufficiently proved”. After citing a study conducted by one Oscar Tosi (infra), and its conclusion that there would be an error in identification of 6%, the court appears to have adopted Tosi’s inference that additional refinements would reduce the error to about 2%.
The adoption of this conclusion must necessarily have involved an independent judgment by the court as to the reliability of the technique, since it is based on an inference and appears not to be based on acceptance by the scientific community.
In Commonwealth v Lykus (367 Mass 191, 204) the court in admitting spectrographic analysis of voice identification, said: "The considerable reliability proved by the Tosi experiment, the greatly added reliability induced by the application of further skills by the experienced examiner working under forensic conditions, and the totality of the evidence received at the voir dire hearing which tended to minimize the importance and weight of adverse or skeptical writings all serve to support a conclusion of general acceptability as required by the rule of the Fatalo and Frye cases.”
Thus, the court in Lykus seems to base its decision in large measure on its own appraisal of reliability as evidenced by the Tosi study.
Extensive hearings have been held during which highly qualified experts testified on both sides of this issue, to determine whether the technique in question has been shown to
A. GENERAL SCIENTIFIC ACCEPTANCE
At the threshold of determining whether the technique meets the test of acceptance in the scientific community, is the question of defining that community.
This court must decide whether the field of inquiry should be limited only to those scientists who actually employ the spectrograph for voice identification, or whether the field should be widened to include those scientists who use the spectrograph for other purposes in connection with the human voice. Leone (25 NY2d 511, 516, supra) pointed out that the "use of the polygraph touches areas in medicine, psychology, and sociology; yet most examiners have no training in those sciences.” Thus, Leone recognized that expertise in disciplines tangential to the one under consideration could have significant bearing on the issue under discussion.
The defense expert in the instant case, Dr. Louis Gerstman, testified that the relevant field of inquiry is acoustic phonetics, and stated that he would add experts in some other fields such as linguists and speech scientists (for obvious reasons, since human speech is after all, the subject of the test), psychologists (ostensibly to study the effects of stress on human speech), and engineers (to study the effects of distortion inherent in the electronic components associated with this test).
In addition, the testimony revealed that there were only 19 persons known to be trained voice examiners. Not all of these persons can be classified as scientists, since many are technicians and law enforcement officers, into which category falls the People’s expert Lundgren. There are perhaps a half dozen experts whom the People have put forth as members of the scientific community who support spectrography for voice identification. These include Dr. Henry Truby, who testified as a People’s witness, and Dr. Peter Ladefoged, who asserts that he, personally, can identify voices through the use of spectrography, but has not conferred this ability on anyone else.
Both Dr. Henry Truby, the People’s expert, and Dr. Louis Gerstman, the defense expert, testified that in general acoustic phoneticians would be expected to be familiar with the sound spectrograph.
In fact the sound spectrograph, developed by Bell Laborator
Dr. Truby testified that he did not know how many acoustic phoneticians there were; nor did he know how many agreed or disagreed with his position.
Dr. Gerstman testified that he was personally acquainted with approximately 400 acoustic phoneticians, which number constituted about one half of the total number of acoustic phoneticians in America. He stated that all of these scientists had achieved Ph.D. degrees.
The only major study in the area of spectrography for voice identification was conducted by Dr. Oscar Tosi, and published in the Journal of the Acoustical Society of America (Vol 51, No 6 [part 2]) in June, 1972, and is entitled "Experiment on Voice Identification.”
It should be pointed out that although many of the courts which admitted spectrographic voice identification have done so based largely on the Tosi study, this study has not been replicated, and there seems to be no other formal experimentation in this area upon which the scientific community can make an informed judgment.
Dr. Gerstman, in speaking of the importance of replication, testified that single studies, while generally thought of as being "interesting”, are generally not accepted as being true unless replicated by a different researcher in a different laboratory.
It is certainly reasonable to expect science to withhold judgment on a new theory until it has been well tested in the crucible of controlled experimentation and study. Such a procedure would require replication of original experiments, and scrutiny of the results in various scientific journals.
T.H. Huxley, the nineteenth century’s leading exponent of scientific education, said: "In scientific inquiry it becomes a matter of duty to expose a supposed law to every possible kind
The Tosi experiment is not so monumental that it could be performed but once in a lifetime. We are not discussing an experiment or study which can only be conducted under certain conditions and over a long period of time, such as certain anthropological studies made in remote and primitive locales over a period of many years. Were this such an experiment, it might be reasonable for the scientific community to be less stringent in its demands for replication.
In this case, therefore, it is not surprising that so few qualified scientists accept this technique, which acceptance would necessarily be predicated on preliminary and incomplete experimentation. In these circumstances, the mere fact that some scientists approve of the technique speaks only to their personal opinions. They, as scientists, may be personally satisfied, but such satisfaction does not and, in fact should not, bespeak general scientific acceptance.
This court believes that the failure to come forward with a great deal more evidence of scientific acceptance of this technique is sufficient to rule the test inadmissible. We have, however, substantial evidence that many members of the relevant scientific community have refused to approve a procedure based on incomplete evidence.
The defense introduced certain documents into evidence which were referred to at the hearing as the Bolt papers. (Speaker Identification by Speech Spectrograms: Some further observations, Journal of Acoustical Society of America, Vol 54, No 2.) The People stipulated that the persons who prepared those papers are respected members of the scientific community dealing with sound spectrography. Dr. Truby also recognized the authors of the Bolt papers as experts.
Indeed, without such a stipulation, this court would have recognized them as such.
Dr. Gerstman testified that the general conclusion of the scientists who prepared the Bolt papers was that the Tosi study was not responsive to questions of reliability in forensic situations and that such reliability is unestablished.
Also, according to Gerstman, Dr. Harry Hollien, who is the head of the Communication Sciences Laboratory at the Uni
Gerstman testified that to his knowledge six scientists, including Dr. Peter Ladefoged, support spectrography for voice identification. He further testified that the great majority of speech scientists believe that voice identification cannot be made with spectrograms. He said that he conversed with the six authors of the Bolt papers during the last year, and their views are held at least as strongly as they were in 1973.
Gerstman said he was able to recollect at least 40 scientists, all on the Ph.D. level and in the relevant field of inquiry, who believe that voice identification by sound spectrograph is not reliable. He named a number of such colleagues who are speech researchers, phoneticians, linguists and professors of speech. The scientists named by him were all associated with well-known institutions such as the University of California at San Diego, Yale University, City University of New York, University of Connecticut, Stanford University, University of North Carolina, Massachusetts Institute of Technology, Queens College, Bell Laboratories, University of California at Los Angeles, and the University of Michigan.
He also referred to a poll commissioned by the Technical Committee on Speech of the Acoustical Society of America.
The said poll requested that members of the society state whether they agreed with the final paragraph of the Bolt paper, or whether they agreed with the final paragraph of the rebuttal to the Bolt paper. (The rebuttal paper was introduced as a People’s exhibit, and purported to rebut the conclusion of the Bolt paper which expressed doubt as to the reliability of the technique. The rebuttal paper was entitled Reply to "Speaker Identification by Speech Spectrograms: Some further observations” [Block, Lashbrook, Nash, Oyer, Pedrey, Tosi, and Truby, Journal of the Acoustical Society of America, Vol 54, No 2].)
Gerstman reported that out of 100 questionnaires, the results were that 89 members answered affirmatively for the Bolt paper, four answered affirmatively for the rebuttal paper, and seven did not respond.
The court notes that it has not been necessary to determine
In this instant case, however, it has been amply demonstrated that the proponents of sound spectrography for voice identification are distinctly in the minority, and that the remainder of the relevant scientific community has either expressed opposition or has expressed no opinion — perhaps necessarily, owing to the incomplete and preliminary nature of the work in this field.
This court has no choice but to hold that the technique of sound spectrography for voice identification has not been sufficiently established to have gained general acceptance in the field in which it belongs.
B. RELIABILITY
Turning to whether the reasonable accuracy of spectrographic analysis for voice identification has been clearly established, the court will bear in mind "the tremendous weight which such tests would necessarily have in the minds of a jury.” (People v Leone, 25 NY2d 511, 518, supra.)
According to the testimony of Dr. Truby the entire technique is based substantially on the premise that intraspeaker variability is never as great as interspeaker variability — therefore, while each speaker’s voice will be somewhat different each time he renders the same utterance, that difference will never be as great as the difference between the utterances of any two different speakers.
It would seem reasonable to suppose that this is true, but this fact has not been proven to the court’s satisfaction. The Tosi study is based on this assumption, and states (p 2031): "Since the parameters responsible for variabilities are not well determined and quantified, at the present time the only way to prove scientifically that 'interspeaker variability’ is greater or different than 'intraspeaker variability’ is by inference”.
Thus, in the conclusion of the Tosi paper, in discussing the experimental speaker identification error, it is stated (p 2041): "the data yielded by all levels of this experiment suggest that the spectrographic interspeaker variability is indeed greater or different than the intraspeaker variability.”
Defense exhibit "E” (Hollien, Status Report of Voiceprint Identification in the United States; Proceedings, 1977 International Conference on Crime Countermeasures, July, 1977, Oxford, England), indicates that disguise of the voice creates significant intraspeaker variability. Although this speaks to incorrect voice elimination as opposed to identification, we agree with Dr. Gerstman when he says that this fact supports the potential that, in the act of disguising one’s voice, a particular speaker will produce a spectrogram which resembles that of some other speaker.
Dr. Gerstman testified as to an article entitled "The Voice-print Mystique” (Ladefoged and Vanderslice, Working Papers in Phonetics, Dept Linguistics, University of California, Los Angeles, Nov., 1967), which reported that if two speakers with similar speech qualities and backgrounds utter the same words, there would eventually develop a "pool” of such utterances, out of which there could be drawn examples of a pair of spectrograms where the variability of two interspeaker spectrograms would be less than that of two intraspeaker examples. Such a conclusion is, to be sure, based on fragmentary indications, but it would seem that a great deal more research is necessary in this area.
Without additional independent proof the court cannot accept the assumption that interspeaker variability is always greater than intraspeaker variability.
According to the testimony, before a determination can be made as to whether or not two voices are identical, the examiner must have before him a spectrogram containing a voice sample of sufficient size.
Dr. Truby stated that there is a "standard” by which an examiner determines whether the sample is of sufficient size. He did not define that standard, or identify its characteristics, beyond saying that it was a "variable standard.” He cited an example of a case where a speaker might suffer a "laryngal spasm”, in which case a sample consisting of one syllable might be sufficient.
Mr. Lundgren testified that the size of the sample is important in deciding whether a determination is made at all, and
It is clear that the standard, if one exists, to determine the sufficiency of the spectrographic sample, is entirely subjective. There appear to be no objective criteria to determine when there is a sufficient sample. Beyond that, it would appear that Dr. Truby, one of the leading proponents of the technique, could articulate no subjective elements or guidelines which might assist an examiner to judge the sufficiency of the sample, beyond asserting the existence of a "variable standard”.
It is not sufficient that an examiner, after extensive training, develops a "feel” for determining the sufficiency of the sample. Certainly there is no long-standing spectrographic tradition which would elevate this technique to the status of an art. Spectrographic analysis has been put forth as a science, and it should be judged on that basis.
Another aspect of the technique which is troublesome is the extent to which aural examination of the voice sample is utilized to assist in making an identification.
Dr. Truby stated that the aural examination played no part in the determination, except to assist the examiner in keeping his place on the spectrogram.
Mr. Lundgren said that the aural examination does play a part in matching voices, as well as in finding one’s place on the spectrogram. He said that in an audition of the voice sample, an examiner would listen for: modes of speech; pathologies such as lisps; methods of speaker formation of words; and ways to reconcile variations in the voiceprint.
Dr. Tosi’s report addressed itself to aural examination (p 2042) as follows: "On the other hand, the experimental examiners in the present study did not listen first to the unknown and known voices while processing the spectrograms for visual comparisons; i.e., in addition to spectrographic examination, he also uses aural examination during the process of reaching a conclusion. A combination of methods of voice recognition by listening and by visual inspection of spectrograms could enhance the accuracy of voice identifications.”
The mere fact that there exists a difference of opinion as to the role of aural examination of spectrograms argues that this technique is not sufficiently developed to be considered reliable.
Again, this technique is advanced as a scientific procedure practiced by those skilled in the theory and use of the spectrograph as it visually demonstrates certain elements in human speech. The identification of voices by ear, as practiced by one claiming to be an experienced and expert auditor, is another matter entirely.
Dr. Gerstman, in fact, has been a witness in certain court proceedings where he was qualified as an expert based on his many years of listening to voices on tape. In those cases, he made subjective judgments as to the identity of speakers, based solely upon aural comparison.
He testified in the instant case that on those occasions, he was not advancing any scientific conclusion, and that he offered his opinion not as a scientist, but as a personal, though expert, opinion.
He further testified that in his opinion, error-free aural identification is not possible. He stated that he did not know whether such aural examination is scientifically reliable because it is a completely subjective method. He further opined that spectrographic identification is equally subjective.
Certainly the factor of aural comparison of voice samples is highly subjective, and we cannot be sure of the role it plays in this technique.
The Tosi report states (p 2041):
"It could be hypothesized that if, in addition to visual comparisons of spectrograms, the examiners had been allowed to listen to the unknown and known voices, these errors might have been further reduced. The experiment performed by Stevens, et al., as well as the opinion of some phoneticians and linguists who feel that speaker recognition by listening is more accurate than by visual comparison of spectrograms, seems to confirm this hypothesis. The field study performed at the Michigan Department of State Police also reinforces this hypothesis.
"A further study including forensic models similar to the ones used in the present experiment might result in important additional information if trained examiners could both listen and make visual comparisons of spectrograms.”
The court recognizes this as yet another factor which is subjective, untested, and highly questionable in its effect on the scientific nature of spectrographic analysis.
This brings us to an analysis of the Tosi report’s assertion that certain refinements would render this technique more reliable in a forensic situation than in the situation circumscribed by the Tosi experiment.
Certain cases have attached a great deal of weight to this consideration, since so much of the opposition to this technique centers around the failure of the Tosi study to closely analogize forensic conditions. (See People v Rogers, 86 Misc 2d 868; Commonwealth v Lykus, 367 Mass 191, supra; United States v Bailer, 519 F2d 463, supra.)
In examining the factors advanced by the Tosi report, the court finds that none of them are so logically compelling as to be inferential in nature.
One such refinement is aural examination, with which we have already dealt and found to be highly subjective and nonscientific.
The factor of the availability of unlimited time for examiners to render a decision (the Tosi examiners were limited to 15 minutes) is equally subjective and also speculative.
The report also cites the added responsibility of the professional examiner, saying (p 2042): "he is aware of the consequences that a wrong decision could mean to his professional status as well as the consequences to the speaker whom he might erroneously identify.”
This assertion hardly calls for rebuttal, except to say that it is inconceivable that a scientific or logical conclusion could possibly be based on such speculation.
The report also cites the fact that the Tosi examiners were forced to render a positive conclusion either identifying or eliminating the unknown speaker, while in the forensic setting the examiner is free to render no conclusion if one seems not to be warranted. Again, the conclusion that such a choice will necessarily increase reliability, and then to such an extent as to validate a scientific procedure, is not logically warranted.
These "refinements” seem to the court to be highly specula
Due to the fact that this technique relies on electronic reproduction of sound, standards of fidelity must necessarily be accounted for in determining reliability.
Mr. Lundgren testified that one of the requirements for reliability of the method is low noise and distortion. If noise and distortion are excessive, they can cause the examiner to alter his decision from "positive” to "probable”.
Dr. Truby also testified that a higher standard of fidelity is an objective condition which may raise the determination from "probable” to "positive”. Dr. Truby also said that distortion is predictable and determinate, and thus does not affect the reliability of the technique.
Dr. Gerstman on the other hand indicated that there was not enough work done to determine the effects of noise, distortion, and fidelity on the production of the spectrogram.
He said that the telephone, for example, has a band pass of 300 Hz to 3,000 Hz (cycles per second). This means that sounds above and below those frequencies are excluded from telephone transmissions. This, plus noise and distortion, result, he said, in making a recording "muddy” and "shrinking the space” of the spectrogram in which to make a comparison.
Gerstman opined that such effects would tend to make spectrograms appear more and more similar.
The actual effect of noise, distortion, and fidelity on spectrographic analysis is not clear, and could actually be minimal.
But in fact we don’t know what the effect might be, and must depend solely upon the examiner’s subjective judgment to determine whether fidelity has affected his decision. More evidence is necessary in this area.
Neither is it clear whether it is necessary to compare voice samples where the subjects articulate the same words.
Dr. Truby said it is helpful but not necessary for two samples to have the same words. Normally, when an examiner compares spectrograms, he compares one word on one spectrogram with the same word represented on another spectrogram.
The court is inclined to accept Dr. Gerstman’s opinion that the words in a speech sample are indistinguishable from the way the words are being articulated, as the spectrograph
Again, research has not yet accounted for this problem.
Other factors not taken into account which affect the reliability of spectrographic analysis are the effect of disguise and mimicry; the fact that female voices have not been the subject of serious study; and the fact that no studies have been done to determine the effect of stress in the production and interpretation of spectrograms, bearing in mind that Dr. Gerstman states that stress does have an important effect on vocalization.
Perhaps just as important is the fact that the results of spectrographic analysis seem to be arrived at in an almost entirely subjective manner.
The Court of Appeals, in People v Leone (25 NY2d 511, 517, supra), in deciding against the admissibility of polygraph testing, said: "the criterion for interpretation of the test chart has not as yet become sufficiently definite to be generally reliable so as to warrant judicial acceptance”.
This is taken to mean that the standards for interpretation of the test chart must be objective rather than subjective; they must be definite rather than vague; they must be capable of being articulated; and the conclusions of any examiner must be capable of being scrutinized to determine whether they comport with a definite set of standards.
None of these conditions are met in connection with spectrographic analysis.
In spite of Dr. Truby’s assertion of the existence of a "variable standard”, there appears to be no way to articulate the precise nature of that standard.
Mr. Lundgren testified that the judgment of the examiner is "very subjective”. He said further that he could see no way to test the judgment of one expert against that of another expert, because the judgments are subjective. He stated that some of the subjective factors which must be taken into account by the examiner are fidelity; size of sample; pathologies (incidentally, there was no evidence that examiners receive any training in recognition of speech pathologies, which apparently can have a substantial impact on the analysis); and the aural examination.
Dr. Gerstman also testified that the analysis of spectrograms is subjective, and that there is no set of definite
It has been suggested that the subjective quality of the judgment made should not be an objection to its admissibility because handwriting analysis, long admissible in New York, is perhaps even more subjective.
The court notes some crucial distinctions between handwriting analysis and spectrographic analysis.
First it appears that handwriting analysis was first approved in this State 100 years ago, in the case of Miles v Loomis (75 NY 288). Even if it were approved, at that time, as a scientific test, that was long before the standards of general scientific acceptance and reliability were mandated in New York. In any event, the passage of a century has seen scientific procedure evolve to a point where all the multiplied branches of inquiry are in constant and immediate communication through a great number of associations, journals and other media; technique has become sophisticated to an extent unimaginable 100 years ago. It is entirely reasonable to now impose, therefore, more stringent standards of acceptance and reliability than would have been imposed then.
However, Miles v Loomis (supra) appears to have admitted testimony as to handwriting simply as expert opinion evidence, and not as the results of any "scientific” test.
The court said (p 295) there was no distinction as to accuracy between the mental process of a witness who is familiar with a particular handwriting, and "that of the expert who has become quick by practice in detecting identity of hands, and also compares in his mind and with his eye the one in question with other signatures as certainly genuine”.
The above process does not describe what could be called a scientific test. In fact, the court said (p 296): "It is undoubtedly true that the opinion as to handwriting should depend not so much upon mathematical measurements and minute criticisms of lines nor their exact correspondence in detail, when placed in juxtaposition with other specimens, as upon its general character and features as in the recognition of the human face. But in the case of one expert, his mental image or idea of the genuine handwriting may become as clear and vivid and accurate by an examination of the other signatures
While it is true that modern examiners of questioned documents use exact measurements and precise comparisons to assist in their analysis and support their conclusions, the mental process of the expert is fundamentally similar to the mental process of a lay witness who happens to be familiar with a particular handwriting specimen. It is to be presumed that the trier of fact could assimilate such expert testimony and accord it the weight it deserves, since handwriting is familiar to all literate persons. Thus the triers of fact have an opportunity to evaluate the subjective judgment of the handwriting expert against a background of their general experience in such matters.
Scientific tests, however, are another matter entirely. Lay persons have no general experience with the sound spectrograph. They have no knowledge of the accuracy of the spectrographic analogue with human speech. They have no knowledge of the accuracy of an examiner’s interpretation.
Leone, speaking of polygraph tests, declared (25 NY2d 511, 518, supra) "We are all aware of the tremendous weight which such tests would necessarily have in the minds of a jury. Thus, we should be most careful in admitting into evidence the results of such tests unless their reasonable accuracy and general scientific acceptance are clearly recognized.”
While these standards may not necessarily apply to the venerable art of handwriting analysis, they most definitely apply to sound spectrographic analysis for voice identification and this court, finding that spectrographic analysis fails to meet these standards, hereby denies the People’s motion.