5 Conclusions

In this work, we introduced a novel cartridge case comparison algorithm designed with the explicit intention of being accessible to fellow researchers and others in the firearm and tool mark community, both in the sense that the software is freely available and in the sense that its behavior is comprehensible.

In Chapter 2, we discussed and implemented a general pipeline structure for a cartridge case comparison algorithm that adheres to the “tidy” principles of design. We demonstrated how this modularized structure makes it easy to experiment with and understand different components of the algorithm. We hope that the structure available in the cmcR R package can be used in the future to easily improve individual pieces of the pipeline rather than re-inventing the wheel with an entirely new algorithm.
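As an illustration of this modular design, one can think of each stage of the pipeline as a swappable function. The sketch below is purely illustrative; the function names and toy computations are hypothetical stand-ins and do not reflect the cmcR API:

```python
# Illustrative sketch of a modular, "tidy" comparison pipeline.
# All names and behaviors here are hypothetical stand-ins; they are
# not the cmcR implementation.

def preprocess(scan):
    """Stand-in pre-processing stage (e.g., leveling a surface)."""
    lowest = min(scan)
    return [value - lowest for value in scan]

def compare(scan_a, scan_b):
    """Stand-in comparison stage: a toy elementwise difference."""
    return sum(abs(a - b) for a, b in zip(scan_a, scan_b))

def score(raw_comparison):
    """Stand-in scoring stage: map a raw comparison to (0, 1]."""
    return 1.0 / (1.0 + raw_comparison)

def pipeline(scan_a, scan_b, pre=preprocess, cmp=compare, sc=score):
    # Because each stage is an argument, a researcher can swap out a
    # single component (say, a different comparison function) without
    # rewriting the rest of the algorithm.
    return sc(cmp(pre(scan_a), pre(scan_b)))

identical = pipeline([3, 4, 5], [3, 4, 5])   # maximal similarity: 1.0
different = pipeline([3, 4, 5], [10, 0, 0])  # lower similarity
```

Swapping in, say, a cross-correlation-based comparison stage requires changing only one argument, which is exactly the kind of component-level experimentation the modularized structure is meant to enable.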

In Chapter 3, we introduced a suite of visual diagnostic tools to aid the user of the algorithm in exploring its behavior. We considered a variety of use-cases in which the diagnostic tools either indicated when a tweak to the algorithm was warranted or illuminated the similarities and differences between two cartridge case surfaces. We hope that the diagnostic tools implemented in the impressions R package and accompanying cartridgeInvestigatR web application will prove useful to both researchers and other stakeholders in understanding the inner workings of the comparison algorithm.

In Chapter 4, we developed the Automatic Cartridge Evidence Scoring (ACES) algorithm, which fuses previously established sub-procedures of the cartridge case comparison pipeline with novel pre-processing, comparing, and scoring techniques. Using a train/test cross-validation procedure, we demonstrated how statistical models can learn from numerical features to effectively distinguish between match and non-match comparisons. We hope that the foundation laid by the ACES comparison pipeline, available in the scored R package, will be built upon with future feature engineering and model exploration.

In future work, we plan to explore methods for computing score-based likelihood ratios (SLRs) as a means of measuring the probativity, rather than just the similarity, of two pieces of evidence. Likelihood ratio methods take both similarity and typicality (Morrison and Enzinger 2018) into account when comparing evidence, which more directly addresses the degree to which the evidence supports the same-source vs. different-source hypotheses. For example, suppose a sample of blood from a suspect has the same ABO type as a sample found at a crime scene. If we were to consider only blood type as a measure of similarity, then these two samples would be “perfectly” similar. Of course, this is not conclusive evidence that the suspect was actually at the crime scene: there is a relatively large probability that any two randomly drawn blood samples from different humans would have matching ABO types. That is, it is not atypical to observe this degree of similarity between two randomly drawn, different-source samples. The same logic extends to cartridge case evidence. Even if two cartridge cases receive a relatively high similarity score, say 0.97, we must consider the likelihood that two randomly drawn, different-source cartridge cases receive the same similarity score. If this value is also large, then the original score is not particularly probative. We hope to take an approach similar to that of Reinders et al. (2022) to explore various SLR computation methods.
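The reasoning in this example can be summarized with the standard score-based likelihood ratio form (the notation below is ours): for an observed similarity score $s$,

```latex
\mathrm{SLR}(s) = \frac{f(s \mid H_{\mathrm{ss}})}{f(s \mid H_{\mathrm{ds}})},
```

where the numerator is the density of the score among same-source pairs (similarity) and the denominator is its density among different-source pairs (typicality). A score of 0.97 is probative only if different-source pairs rarely attain it; if they commonly do, the denominator is large and the SLR is close to or below one.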

An important avenue of exploration is the generalizability of a model like ACES to other conditions. That is, how robust is our currently trained version of ACES to the make/model of the cartridge case or firearm, the scanning tool used, or the degree of wear/degradation on the casing? If this current iteration of ACES turns out to be sensitive to these conditions, could it be made more robust by simply training on a more representative sample of cartridge cases, or will we need to individually train one model for each combination of factors? Fundamentally, these questions point to whether breech face impressions are consistent enough that we can apply the same procedure to any pair of cartridge cases. Although we don’t have the answers to these questions, we do have the infrastructure, across the various software tools we’ve created, to start answering them.

Using ACES, we are able to achieve a much better balance between the true negative and true positive classification rates compared to the examiner decision results reported in Baldwin et al. (2014), which had a much lower true negative rate than true positive rate. In fact, if we use classification accuracy as the optimization criterion during training, our results actually complement those reported in Baldwin et al. (2014) in that we obtain a near-perfect true negative rate (99.9%) and a relatively high true positive rate (92.4%). Through our experimentation, it became clear to us that it is harder to algorithmically identify matching comparisons than non-matches. Specifically, it seems to us that there are many more ways in which two scans can appear different to the algorithm than similar. As we discussed in Chapter 3, extreme, non-breech face markings in the scan that aren’t removed during pre-processing may “distract” the registration procedure on which our pipeline heavily relies. Other cartridge cases might have unremarkable impressions, leading to a similarity score that doesn’t strongly indicate a match or non-match (what forensic examiners might call an “inconclusive”). The visual diagnostic tools developed in this work make it easier to manually inspect and identify these cases. However, we are also interested in developing “automatic” techniques to characterize and identify markings. For example, we might use texture modeling and segmentation to identify distinctive markings on a surface. Alternatively, we could use a correlation measure and/or registration procedure that is more robust to extreme markings.
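To make the rate terminology concrete, the following sketch computes true positive and true negative rates from confusion-matrix counts. The counts are invented for illustration (chosen to produce rates of the same magnitude as those above); they are not the ACES evaluation data:

```python
def classification_rates(tp, fn, tn, fp):
    """True positive rate (sensitivity) and true negative rate
    (specificity) from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # matches correctly declared as matches
    tnr = tn / (tn + fp)  # non-matches correctly declared as non-matches
    return tpr, tnr

# Hypothetical counts: non-match comparisons vastly outnumber matches,
# as happens when comparing all pairs in a database.
tpr, tnr = classification_rates(tp=924, fn=76, tn=9990, fp=10)
```

Note that under such imbalance, optimizing raw classification accuracy implicitly favors the true negative rate, since non-match comparisons dominate the total count.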

Morrison and Enzinger (2018) provide a compelling argument that “[p]rocedures based on similarity-only scores do not appropriately account for typicality with respect to the relevant population,” which seems to directly oppose the approach we took with ACES. However, we believe that these arguments apply to specific types of “similarity-only” scores, such as a Euclidean distance or correlation, whose behavior is invariant to the specific evidence being compared. We do not believe that ACES exhibits the same invariance as these simpler scoring procedures, and we hope to explore this further in the future.

Nonetheless, we do think that the alternative approaches to measuring probativity presented in Morrison and Enzinger (2018) as well as Basu, Bolton-King, and Morrison (2022) have merit. Specifically, rather than computing comparative features that map a pair of scans to a number, such as a correlation, these authors propose computing descriptive features independently for each scan and then modeling the joint distribution of these features as a means of computing SLRs. For example, Basu, Bolton-King, and Morrison (2022) consider decomposing a cartridge case scan using various two-dimensional orthonormal bases as a means of capturing different characteristics of the surface. Doing this independently for two cartridge cases allows one to consider the joint likelihood of observing these feature values (or some dimension-reduced mapping of them) under the same- and different-source hypotheses, which sidesteps distilling the features into a univariate similarity score as is done in ACES. Possible descriptive features could come from sources such as decompositions like those used in Basu, Bolton-King, and Morrison (2022), local feature detector methods such as “Speeded Up Robust Features” (Bay, Tuytelaars, and Van Gool 2006), or even autoencoding/generative neural network models, although such features would be harder to interpret.
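In symbols (ours), the feature-based approach evaluates the joint likelihood of the two descriptive feature vectors directly,

```latex
\mathrm{LR} = \frac{f(\mathbf{x}_1, \mathbf{x}_2 \mid H_{\mathrm{ss}})}{f(\mathbf{x}_1, \mathbf{x}_2 \mid H_{\mathrm{ds}})},
```

where $\mathbf{x}_1$ and $\mathbf{x}_2$ are computed independently from each scan. No intermediate univariate score $s = g(\mathbf{x}_1, \mathbf{x}_2)$ is formed, in contrast to the similarity-score approach taken in ACES.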

By now, we hope that it is clear to the reader that a goal pervading every aspect of this work is transparency. Given the gravity of the application, we consider it necessary that tools used in making an evidentiary conclusion that could inform a judicial decision be as transparent and user-friendly as possible. As discussed in Chapter 2, this means structuring an algorithm to be flexible and sharing data and code whenever possible. As discussed in Chapter 3, this means creating supplementary tools that help the user understand the methods they’re applying. As discussed in Chapter 4, this means building interpretable and effective features and models. We believe that doing so leads to stronger collaboration between research teams, expedited improvements to the underlying methods, and a more equitable and trustworthy justice system. We hope that this work stands as a testament to the firearm and tool mark community that algorithms applicable to casework can simultaneously be effective, accessible, and approachable if intention is put towards the endeavor.

30 Magazine Clip. 2017. “Calibers Explained.” 30 Magazine Clip.
AFTE Criteria for Identification Committee. 1992. “Theory of Identification, Range Striae Comparison Reports and Modified Glossary Definitions.” AFTE Journal 24 (3): 336–40.
Allaire, JJ, Yihui Xie, R Foundation, Hadley Wickham, Journal of Statistical Software, Ramnath Vaidyanathan, Association for Computing Machinery, et al. 2021. rticles: Article Formats for R Markdown. https://CRAN.R-project.org/package=rticles.
American Academy of Forensic Sciences. 2021. “What Is Forensic Science?” American Academy of Forensic Sciences. https://www.aafs.org/careers-forensic-science/what-forensic-science.
Anscombe, F. J. 1973. “Graphs in Statistical Analysis.” The American Statistician 27 (1): 17. https://doi.org/10.2307/2682899.
Aurich, Volker, and Jörg Weule. 1995. “Non-Linear Gaussian Filters Performing Edge Preserving Diffusion.” In Informatik Aktuell, 538–45. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-79980-8_63.
Bache, Stefan Milton, and Hadley Wickham. 2022. magrittr: A Forward-Pipe Operator for R. https://CRAN.R-project.org/package=magrittr.
Baldwin, David P., Stanley J. Bajic, Max Morris, and Daniel Zamzow. 2014. “A Study of False-Positive and False-Negative Error Rates in Cartridge Case Comparisons.” Defense Technical Information Center. https://doi.org/10.21236/ada611807.
Barthelme, Simon. 2019. imager: Image Processing Library Based on ’CImg’. https://CRAN.R-project.org/package=imager.
Basu, Nabanita, Rachel S. Bolton-King, and Geoffrey Stewart Morrison. 2022. “Forensic Comparison of Fired Cartridge Cases: Feature-Extraction Methods for Feature-Based Calculation of Likelihood Ratios.” Forensic Science International: Synergy 5: 100272. https://doi.org/10.1016/j.fsisyn.2022.100272.
Bay, Herbert, Tinne Tuytelaars, and Luc Van Gool. 2006. “SURF: Speeded Up Robust Features.” In Computer Vision - ECCV 2006, 404–17. Springer Berlin Heidelberg. https://doi.org/10.1007/11744023_32.
Beeley, Chris, and Shitalkumar R Sukhdeve. 2018. Web Application Development with R Using Shiny. 3rd ed. Birmingham, England: Packt Publishing.
Belle, Vaishak, and Ioannis Papantonis. 2021. “Principles and Practice of Explainable Machine Learning.” Frontiers in Big Data 4.
Berry, Nick, James Taylor, and Felix Baez-Santiago. 2021. handwriter: Handwriting Analysis in R. https://CRAN.R-project.org/package=handwriter.
Breiman, Leo. 2001. “Random Forests.” Machine Learning 45 (1): 5–32. https://doi.org/10.1023/a:1010933404324.
Breiman, Leo, Jerome H. Friedman, Richard A. Olshen, and Charles J. Stone. 2017. Classification and Regression Trees. Routledge. https://doi.org/10.1201/9781315139470.
Brigham, E. Oran. 1988. The Fast Fourier Transform and Its Applications. USA: Prentice-Hall, Inc.
Brinkman, S., and H. Bodschwinna. 2003a. “Advanced Gaussian Filters.” In Advanced Techniques for Assessment Surface Topography: Development of a Basis for 3d Surface Texture Standards "SURFSTAND", edited by L. Blunt and X. Jiang. United States: Elsevier Inc. https://doi.org/10.1016/B978-1-903996-11-9.X5000-2.
———. 2003b. Advanced Techniques for Assessment Surface Topography. Elsevier. https://doi.org/10.1016/b978-1-903996-11-9.x5000-2.
Brown, Lisa Gottesfeld. 1992. “A Survey of Image Registration Techniques.” ACM Computing Surveys 24 (4): 325–76. https://doi.org/10.1145/146370.146374.
Buja, Andreas, Dianne Cook, Heike Hofmann, Michael Lawrence, Eun-Kyung Lee, Deborah F. Swayne, and Hadley Wickham. 2009. “Statistical Inference for Exploratory Data Analysis and Model Diagnostics.” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 367 (1906): 4361–83. https://doi.org/10.1098/rsta.2009.0120.
Cadre Forensics. 2019. “Top Match-3d High Capacity: 3d Imaging and Analysis System for Firearm Forensics.” Cadre Forensics.
Chang, Andrew C., and Phillip Li. 2022. “Is Economics Research Replicable? Sixty Published Papers from Thirteen Journals Say Often Not.” Critical Finance Review 11 (1): 185–206. https://doi.org/10.1561/104.00000053.
Chang, Winston, Joe Cheng, JJ Allaire, Carson Sievert, Barret Schloerke, Yihui Xie, Jeff Allen, Jonathan McPherson, Alan Dipert, and Barbara Borges. 2021. shiny: Web Application Framework for R. https://CRAN.R-project.org/package=shiny.
Chapnick, Chad, Todd J. Weller, Pierre Duez, Eric Meschke, John Marshall, and Ryan Lilien. 2020. “Results of the 3d Virtual Comparison Microscopy Error Rate (VCMER) Study for Firearm Forensics.” Journal of Forensic Sciences 66 (2): 557–70. https://doi.org/10.1111/1556-4029.14602.
Chen, Zhe, John Song, Wei Chu, Johannes A. Soons, and Xuezeng Zhao. 2017. “A Convergence Algorithm for Correlation of Breech Face Images Based on the Congruent Matching Cells (CMC) Method.” Forensic Science International 280 (November): 213–23. https://doi.org/10.1016/j.forsciint.2017.08.033.
Chu, Wei, Mingsi Tong, and John Song. 2013. “Validation Tests for the Congruent Matching Cells (CMC) Method Using Cartridge Cases Fired with Consecutively Manufactured Pistol Slides.” Journal of the Association of Firearms and Toolmarks Examiners 45 (4): 6. https://www.nist.gov/publications/validation-tests-congruent-matching-cells-cmc-method-using-cartridge-cases-fired.
Cleveland, W. S. 1994. The Elements of Graphing Data. AT&T Bell Laboratories. https://books.google.com/books?id=KMsZAQAAIAAJ.
Crawford, Amy. 2020. “Bayesian Hierarchical Modeling for the Forensic Evaluation of Handwritten Documents.” PhD thesis, Iowa State University. https://doi.org/10.31274/etd-20200624-257.
Crowder, M. J., and D. J. Hand. 1990. Analysis of Repeated Measures. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. Taylor & Francis. https://books.google.com/books?id=XsGX6Jgzo-IC.
Curran, James Michael, Tacha Natalie Hicks Champod, and John S Buckleton, eds. 2000a. Forensic Interpretation of Glass Evidence. Boca Raton, FL: CRC Press.
———, eds. 2000b. Forensic Interpretation of Glass Evidence. Boca Raton, FL: CRC Press.
DeFrance, Charles, and MD Arsdale. 2003. “Validation Study of Electrochemical Rifling.” Association of Firearms and Tool Marks Examiners Journal 35 (January): 35–37.
Deng, Houtao. 2018. “Interpreting Tree Ensembles with inTrees.” International Journal of Data Science and Analytics 7 (4): 277–87. https://doi.org/10.1007/s41060-018-0144-8.
Desai, Deven R., and Joshua A. Kroll. 2017. “Trust but Verify: A Guide to Algorithms and the Law.” Harvard Journal of Law & Technology (Harvard JOLT) 31 (1): 1–64. https://ssrn.com/abstract=2959472.
Duez, Pierre, Todd Weller, Marcus Brubaker, Richard E. Hockensmith, and Ryan Lilien. 2017. “Development and Validation of a Virtual Examination Tool for Firearm Forensics.” Journal of Forensic Sciences 63 (4): 1069–84. https://doi.org/10.1111/1556-4029.13668.
Duvendack, Maren, Richard W. Palmer-Jones, and W. Reed. 2015. “Replications in Economics: A Progress Report.” Econ Journal Watch 12 (2).
Emerson, John W, Walton A Green, Barret Schloerke, Jason Crowley, Dianne Cook, Heike Hofmann, and Hadley Wickham. 2012. “The Generalized Pairs Plot.” Journal of Computational and Graphical Statistics 22 (1): 79–91. http://www.tandfonline.com/doi/ref/10.1080/10618600.2012.694762.
Ester, Martin, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. 1996. “A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise.” In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, 226–31. KDD’96. Portland, Oregon: AAAI Press.
Fadul, T., G. Hernandez, S. Stoiloff, and Gulati Sneh. 2011a. “An Empirical Study to Improve the Scientific Foundation of Forensic Firearm and Tool Mark Identification Utilizing 10 Consecutively Manufactured Slides.” https://www.ojp.gov/ncjrs/virtual-library/abstracts/empirical-study-improve-scientific-foundation-forensic-firearm-and.
———. 2011b. “An Empirical Study to Improve the Scientific Foundation of Forensic Firearm and Tool Mark Identification Utilizing 10 Consecutively Manufactured Slides.” https://www.ojp.gov/ncjrs/virtual-library/abstracts/empirical-study-improve-scientific-foundation-forensic-firearm-and.
Fernández, Alberto, Salvador Garcı́a, Mikel Galar, Ronaldo C. Prati, Bartosz Krawczyk, and Francisco Herrera. 2018. Learning from Imbalanced Data Sets. Springer International Publishing. https://doi.org/10.1007/978-3-319-98074-4.
Garton, Nathaniel, Danica Ommen, Jarad Niemi, and Alicia Carriquiry. 2020. “Score-Based Likelihood Ratios to Evaluate Forensic Pattern Evidence.” https://arxiv.org/abs/2002.09470.
“Geometrical product specifications (GPS) - Filtration - Part 61: Linear areal filters: Gaussian filters.” 2011. Standard. Vol. 2011. Geneva, CH: International Organization for Standardization.
“Geometrical product specifications (GPS) - Filtration - Part 71: Robust areal filters: Gaussian regression filters.” 2014. Standard. Vol. 2014. Geneva, CH: International Organization for Standardization. https://www.iso.org/standard/60159.html.
“Geometrical product specifications (GPS) — Surface texture: Areal — Part 72: XML file format x3p.” 2017. Standard. Vol. 2014. Geneva, CH: International Organization for Standardization. https://www.iso.org/standard/62310.html.
Goldstein, E, and James Brockmole. 2016. Sensation and Perception. 10th ed. Mason, OH: CENGAGE Learning Custom Publishing.
Goode, Katherine, and Heike Hofmann. 2021. “Visual Diagnostics of an Explainer Model: Tools for the Assessment of LIME Explanations.” Statistical Analysis and Data Mining: The ASA Data Science Journal 14 (2): 185–200. https://doi.org/10.1002/sam.11500.
Goodman, Steven N., Daniele Fanelli, and John P. A. Ioannidis. 2016. “What Does Research Reproducibility Mean?” Science Translational Medicine 8 (341): 341ps12–12. https://doi.org/10.1126/scitranslmed.aaf5027.
Goor, Robert, Douglas Hoffman, and George Riley. 2020. “Novel Method for Accurately Assessing Pull-up Artifacts in STR Analysis.” Forensic Science International: Genetics 51 (November): 102410. https://doi.org/10.1016/j.fsigen.2020.102410.
Grüning, Björn, John Chilton, Johannes Köster, Ryan Dale, Nicola Soranzo, Marius van den Beek, Jeremy Goecks, Rolf Backofen, Anton Nekrutenko, and James Taylor. 2018. “Practical Computational Reproducibility in the Life Sciences.” Cell Systems 6 (6): 631–35. https://doi.org/10.1016/j.cels.2018.03.014.
Gundersen, Odd Erik, Yolanda Gil, and David W. Aha. 2018. “On Reproducible AI: Towards Reproducible Research, Open Science, and Digital Scholarship in AI Publications.” AI Magazine 39 (3): 56–68. https://doi.org/10.1609/aimag.v39i3.2816.
Hadler, Jeremy R., and Max D. Morris. 2017. “An Improved Version of a Tool Mark Comparison Algorithm.” Journal of Forensic Sciences 63 (3): 849–55. https://doi.org/10.1111/1556-4029.13640.
Hamby, James E., David J. Brundage, and James W. Thorpe. 2009. “The Identification of Bullets Fired from 10 Consecutively Rifled 9mm Ruger Pistol Barrels: A Research Project Involving 507 Participants from 20 Countries.” In AFTE Journal, 41:99–110.
Hampton, Della. 2016. “Firearms Identification: A Discipline Mainly Concerned with Determining Whether a Bullet or Cartridge Was Fired by a Particular Weapon.” SlidePlayer.
Haralick, Robert M., Stanley R. Sternberg, and Xinhua Zhuang. 1987. “Image Analysis Using Mathematical Morphology.” IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-9 (4): 532–50. https://doi.org/10.1109/tpami.1987.4767941.
Hare, Eric, Heike Hofmann, and Alicia Carriquiry. 2017. “Automatic Matching of Bullet Land Impressions.” The Annals of Applied Statistics 11 (4): 2332–56. http://arxiv.org/abs/1601.05788.
Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2001. The Elements of Statistical Learning. Springer Series in Statistics. New York, NY, USA: Springer New York Inc.
Hesselink, Wim H., Arnold Meijster, and Coenraad Bron. 2001. “Concurrent Determination of Connected Components.” Science of Computer Programming 41 (2): 173–94. https://doi.org/10.1016/S0167-6423(01)00007-7.
Hofmann, Heike, Alicia Carriquiry, and Susan Vanderplas. 2021. “Treatment of Inconclusives in the AFTE Range of Conclusions.” Law, Probability and Risk 19 (3-4): 317–64. https://doi.org/10.1093/lpr/mgab002.
Hofmann, Heike, Susan Vanderplas, Ganesh Krishnan, and Eric Hare. 2020. x3ptools: Tools for Working with 3D Surface Measurements. https://cran.r-project.org/web/packages/x3ptools/index.html.
Huber, Wolfgang, Vincent J Carey, Robert Gentleman, Simon Anders, Marc Carlson, Benilton S Carvalho, Hector Corrada Bravo, et al. 2015. “Orchestrating High-Throughput Genomic Analysis with Bioconductor.” Nature Methods 12 (2): 115–21. https://doi.org/10.1038/nmeth.3252.
Indiana County Court of Common Pleas. 2009. Commonwealth of Pennsylvania Vs. Kevin j. Foley.
Iqbal, Shareen A., Joshua D. Wallach, Muin J. Khoury, Sheri D. Schully, and John P. A. Ioannidis. 2016. “Reproducible Research Practices and Transparency Across the Biomedical Literature.” Edited by David L Vaux. PLOS Biology 14 (1): e1002333. https://doi.org/10.1371/journal.pbio.1002333.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer. https://faculty.marshall.usc.edu/gareth-james/ISL/.
Knowles, Laura, Daniel Hockey, and John Marshall. 2021. “The Validation of 3d Virtual Comparison Microscopy (VCM) in the Comparison of Expended Cartridge Cases.” Journal of Forensic Sciences 67 (2): 516–23. https://doi.org/10.1111/1556-4029.14942.
Krishnan, Ganesh, and Heike Hofmann. 2018. “Adapting the Chumbley Score to Match Striae on Land Engraved Areas (LEAs) of Bullets,” Journal of Forensic Sciences 64 (3): 728–40. https://doi.org/10.1111/1556-4029.13950.
Kuhn, Max. 2022. caret: Classification and Regression Training. https://CRAN.R-project.org/package=caret.
Kwong, Katherine. 2017. “The Algorithm Says You Did It: The Use of Black Box Algorithms to Analyze Complex DNA Evidence Notes.” Harvard Journal of Law & Technology (Harvard JOLT) 31 (1): 275–302. https://jolt.law.harvard.edu/assets/articlePDFs/v31/31HarvJLTech275.pdf.
Leek, Jeffrey T., and Leah R. Jager. 2017. “Is Most Published Research Really False?” Annual Review of Statistics and Its Application 4 (1): 109–22. https://doi.org/10.1146/annurev-statistics-060116-054104.
Liaw, Andy, and Matthew Wiener. 2002. “Classification and Regression by randomForest.” R News 2 (3): 18–22. https://CRAN.R-project.org/doc/Rnews/.
MacQueen, J. B. 1967. “Some Methods for Classification and Analysis of MultiVariate Observations.” In Proc. Of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, edited by L. M. Le Cam and J. Neyman, 1:281–97. University of California Press.
Mattijssen, Erwin J. A. T., Cilia L. M. Witteman, Charles E. H. Berger, Nicolaas W. Brand, and Reinoud D. Stoel. 2020. “Validity and Reliability of Forensic Firearm Examiners.” Forensic Science International 307: 110112. https://www.sciencedirect.com/science/article/pii/S0379073819305249.
Midway, Stephen R. 2020. “Principles of Effective Data Visualization.” Patterns 1 (9): 100141. https://doi.org/10.1016/j.patter.2020.100141.
Morrison, Geoffrey Stewart, and Ewald Enzinger. 2018. “Score Based Procedures for the Calculation of Forensic Likelihood Ratios - Scores Should Take Account of Both Similarity and Typicality.” Science & Justice 58 (1): 47–58. https://doi.org/10.1016/j.scijus.2017.06.005.
National Academies of Sciences, Engineering, and Medicine. 2019. Reproducibility and Replicability in Science. National Academies Press. https://doi.org/10.17226/25303.
National Research Council. 2009. Strengthening Forensic Science in the United States: A Path Forward. Washington, DC: The National Academies Press. https://doi.org/10.17226/12589.
Neuman, Maddisen, Callan Hundl, Aimee Grimaldi, Donna Eudaley, Darrell Stein, and Peter Stout. 2022. “Blind Testing in Firearms: Preliminary Results from a Blind Quality Control Program.” Journal of Forensic Sciences 67 (3): 964–74. https://doi.org/10.1111/1556-4029.15031.
Ommen, Danica M, and Christopher P Saunders. 2018. “Building a Unified Statistical Framework for the Forensic Identification of Source Problems.” Law, Probability and Risk 17 (2): 179–97. https://doi.org/10.1093/lpr/mgy008.
OSAC Human Factors Committee. 2020. “Human Factors in Validation and Performance Testing of Forensic Science.” Organization of Scientific Area Committees (OSAC) for Forensic Science. https://doi.org/10.29325/osac.ts.0004.
Ott, Daniel, Robert Thompson, and Junfeng Song. 2017. “Applying 3d Measurements and Computer Matching Algorithms to Two Firearm Examination Proficiency Tests.” Forensic Science International 271 (February): 98–106. https://doi.org/10.1016/j.forsciint.2016.12.014.
Park, Soyoung, and Alicia Carriquiry. 2020. “An Algorithm to Compare Two-Dimensional Footwear Outsole Images Using Maximum Cliques and Speeded-up Robust Feature.” Statistical Analysis and Data Mining: The ASA Data Science Journal 13 (2): 188–99. https://doi.org/10.1002/sam.11449.
Park, Soyoung, and Sam Tyner. 2019. “Evaluation and Comparison of Methods for Forensic Glass Source Conclusions.” Forensic Science International 305 (December): 110003. https://doi.org/10.1016/j.forsciint.2019.110003.
Peng, Roger D. 2009. “Reproducible Research and Biostatistics.” Biostatistics 10 (3): 405–8. https://doi.org/10.1093/biostatistics/kxp014.
———. 2011. “Reproducible Research in Computational Science.” Science 334 (6060): 1226–27. https://www.jstor.org/stable/41352177.
Piccolo, Stephen R., and Michael B. Frampton. 2016. “Tools and Techniques for Computational Reproducibility.” GigaScience 5 (1). https://doi.org/10.1186/s13742-016-0135-4.
President’s Council of Advisors on Science and Technology. 2016. “Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods.” https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/PCAST/pcast_forensic_science_report_final.pdf.
Puiutta, Erika, and Eric M. S. P. Veith. 2020. “Explainable Reinforcement Learning: A Survey.” In Lecture Notes in Computer Science, 77–95. Springer International Publishing. https://doi.org/10.1007/978-3-030-57321-8_5.
R Core Team. 2017. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
———. 2023. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Rattenbury, Richard C. 2015. “Semiautomatic Pistol.” Encyclopædia Britannica. Encyclopædia Britannica, inc.
Reinders, Stephanie, Yong Guan, Danica Ommen, and Jennifer Newman. 2022. “Source-Anchored, Trace-Anchored, and General Match Score-Based Likelihood Ratios for Camera Device Identification.” Journal of Forensic Sciences 67 (3): 975–88. https://doi.org/10.1111/1556-4029.14991.
Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. “‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–44. KDD ’16. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/2939672.2939778.
Rice, Kiegan E. 2020. “A Framework for Statistical and Computational Reproducibility in Large-Scale Data Analysis Projects with a Focus on Automated Forensic Bullet Evidence Comparison.” ProQuest Dissertations and Theses. PhD thesis.
Riva, Fabiano, and Christophe Champod. 2014. “Automatic Comparison and Evaluation of Impressions Left by a Firearm on Fired Cartridge Cases.” Journal of Forensic Sciences 59 (3): 637–47. https://doi.org/10.1111/1556-4029.12382.
Riva, Fabiano, Rob Hermsen, Erwin Mattijssen, Pascal Pieper, and Christophe Champod. 2016. “Objective Evaluation of Subclass Characteristics on Breech Face Marks.” Journal of Forensic Sciences 62 (2): 417–22. https://doi.org/10.1111/1556-4029.13274.
Riva, Fabiano, Erwin J. A. T. Mattijssen, Rob Hermsen, Pascal Pieper, W. Kerkhoff, and Christophe Champod. 2020. “Comparison and Interpretation of Impressed Marks Left by a Firearm on Cartridge Cases Towards an Operational Implementation of a Likelihood Ratio Based Technique.” Forensic Science International 313 (August): 110363. https://doi.org/10.1016/j.forsciint.2020.110363.
Roth, Joseph, Andrew Carriveau, Xiaoming Liu, and Anil K. Jain. 2015. “Learning-Based Ballistic Breech Face Impression Image Matching.” In 2015 IEEE 7th International Conference on Biometrics Theory, Applications and Systems (BTAS), 1–8. https://doi.org/10.1109/BTAS.2015.7358774.
Schloerke, Barret, Di Cook, Joseph Larmarange, Francois Briatte, Moritz Marbach, Edwin Thoen, Amos Elberg, and Jason Crowley. 2021. GGally: Extension to ’Ggplot2’. https://CRAN.R-project.org/package=GGally.
Smith, Tasha P., G. Andrew Smith, and Jeffrey B. Snipes. 2016. “A Validation Study of Bullet and Cartridge Case Comparisons Using Samples Representative of Actual Casework.” Journal of Forensic Sciences 61 (4): 939–46. https://doi.org/10.1111/1556-4029.13093.
Song, John. 2013. “Proposed ‘NIST Ballistics Identification System (NBIS)’ Based on 3d Topography Measurements on Correlation Cells.” American Firearm and Tool Mark Examiners Journal 45 (2): 11. https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=910868.
Song, John, Wei Chu, Mingsi Tong, and Johannes Soons. 2014. “3d Topography Measurements on Correlation Cells—a New Approach to Forensic Ballistics Identifications.” Measurement Science and Technology 25 (6): 064005. https://doi.org/10.1088/0957-0233/25/6/064005.
Song, John, Theodore V. Vorburger, Wei Chu, James Yen, Johannes A. Soons, Daniel B. Ott, and Nien Fan Zhang. 2018. “Estimating Error Rates for Firearm Evidence Identifications in Forensic Science.” Forensic Science International 284 (March): 15–32. https://doi.org/10.1016/j.forsciint.2017.12.013.
Stodden, Victoria, Peixuan Guo, and Zhaokun Ma. 2013. “Toward Reproducible Computational Research: An Empirical Analysis of Data and Code Policy Adoption by Journals.” Edited by Dmitri Zaykin. PLoS ONE 8 (6): e67111. https://doi.org/10.1371/journal.pone.0067111.
Stodden, Victoria, Matthew S. Krafczyk, and Adhithya Bhaskar. 2018. “Enabling the Verification of Computational Results.” In Proceedings of the First International Workshop on Practical Reproducible Evaluation of Computer Systems. ACM. https://doi.org/10.1145/3214239.3214242.
Stodden, Victoria, Jennifer Seiler, and Zhaokun Ma. 2018. “An Empirical Analysis of Journal Policy Effectiveness for Computational Reproducibility.” Proceedings of the National Academy of Sciences 115 (11): 2584–89. https://doi.org/10.1073/pnas.1708290115.
Stroman, A. 2014. “Empirically Determined Frequency of Error in Cartridge Case Examinations Using a Declared Double-Blind Format.” AFTE Journal 46 (January): 157–75.
Swofford, H., and C. Champod. 2021. “Implementation of Algorithms in Pattern & Impression Evidence: A Responsible and Practical Roadmap.” Forensic Science International: Synergy 3: 100142. https://doi.org/10.1016/j.fsisyn.2021.100142.
Tai, Xiao Hui. 2019. “Matching Problems in Forensics.” PhD thesis, Carnegie Mellon University. https://kilthub.cmu.edu/articles/Matching_Problems_in_Forensics/9963596/1.
———. 2021. cartridges3D: Algorithm to Compare Cartridge Case Images. https://github.com/xhtai/cartridges3D.
Tai, Xiao Hui, and William F. Eddy. 2018. “A Fully Automatic Method for Comparing Cartridge Case Images.” Journal of Forensic Sciences 63 (2): 440–48. http://doi.wiley.com/10.1111/1556-4029.13577.
Telea, Alexandru C. 2014. Data Visualization: Principles and Practice. CRC Press.
The Linux Foundation. 2017. “Using Open Source Software to Speed up Development and Gain Business Advantage.” https://www.linuxfoundation.org/blog/using-open-source-software-to-speed-development-and-gain-business-advantage/.
Therneau, Terry, and Beth Atkinson. 2022. rpart: Recursive Partitioning and Regression Trees. https://CRAN.R-project.org/package=rpart.
Thompson, Robert. 2017. Firearm Identification in the Forensic Science Laboratory. National District Attorneys Association. https://doi.org/10.13140/RG.2.2.16250.59846.
Tong, Mingsi, John Song, and Wei Chu. 2015. “An Improved Algorithm of Congruent Matching Cells (CMC) Method for Firearm Evidence Identifications.” Journal of Research of the National Institute of Standards and Technology 120 (April): 102. https://doi.org/10.6028/jres.120.008.
Tong, Mingsi, John Song, Wei Chu, and Robert M. Thompson. 2014. “Fired Cartridge Case Identification Using Optical Images and the Congruent Matching Cells (CMC) Method.” Journal of Research of the National Institute of Standards and Technology 119 (November): 575. https://doi.org/10.6028/jres.119.023.
Tvedebrink, Torben, Mikkel Meyer Andersen, and James Michael Curran. 2020. “DNAtools: Tools for Analysing Forensic Genetic DNA Data.” Journal of Open Source Software 5 (45): 1981. https://doi.org/10.21105/joss.01981.
Tyner, Sam, Soyoung Park, Ganesh Krishnan, Karen Pan, Eric Hare, Amanda Luby, Xiao Hui Tai, Heike Hofmann, and Guillermo Basulto-Elias. 2019. “sctyner/OpenForSciR: Create DOI for Open Forensic Science in R.” Zenodo. https://zenodo.org/record/3418141.
Ulery, Bradford T., R. Austin Hicklin, JoAnn Buscaglia, and Maria Antonia Roberts. 2011. “Accuracy and Reliability of Forensic Latent Fingerprint Decisions.” Proceedings of the National Academy of Sciences 108 (19): 7733–38. https://doi.org/10.1073/pnas.1018707108.
———. 2012. “Repeatability and Reproducibility of Decisions by Latent Fingerprint Examiners.” Edited by Chuhsing Kate Hsiao. PLoS ONE 7 (3): e32800. https://doi.org/10.1371/journal.pone.0032800.
Ulery, Bradford T., R. Austin Hicklin, Maria Antonia Roberts, and JoAnn Buscaglia. 2014. “Measuring What Latent Fingerprint Examiners Consider Sufficient Information for Individualization Determinations.” Edited by Francesco Pappalardo. PLoS ONE 9 (11): e110179. https://doi.org/10.1371/journal.pone.0110179.
Vanderplas, Susan, Melissa Nally, Tylor Klep, Cristina Cadevall, and Heike Hofmann. 2020. “Comparison of Three Similarity Scores for Bullet LEA Matching.” Forensic Science International, January. https://doi.org/10.1016/j.forsciint.2020.110167.
Venables, W. N., and B. D. Ripley. 2002. Modern Applied Statistics with S. 4th ed. New York: Springer. http://www.stats.ox.ac.uk/pub/MASS4.
Vorburger, T V, J Song, and N Petraco. 2015. “Topography Measurements and Applications in Ballistics and Tool Mark Identifications.” Surface Topography: Metrology and Properties 4 (1): 013002. https://doi.org/10.1088/2051-672x/4/1/013002.
Vorburger, T V, J H Yen, B Bachrach, T B Renegar, J J Filliben, L Ma, H G Rhee, et al. 2007. “Surface Topography Analysis for a Feasibility Assessment of a National Ballistics Imaging Database.” NIST IR 7362. Gaithersburg, MD: National Institute of Standards and Technology. https://doi.org/10.6028/NIST.IR.7362.
Weller, Todd J., Alan Zheng, Robert Thompson, and Fred Tulleners. 2012. “Confocal Microscopy Analysis of Breech Face Marks on Fired Cartridge Cases from 10 Consecutively Manufactured Pistol Slides.” Journal of Forensic Sciences 57 (4). https://doi.org/10.1111/j.1556-4029.2012.02072.x.
Weller, Todd, Marcus Brubaker, Pierre Duez, and Ryan Lilien. 2015. “Introduction and Initial Evaluation of a Novel Three-Dimensional Imaging and Analysis System for Firearm Forensics.” AFTE Journal 47 (January): 198.
Werner, Denis, Romain Berthod, Damien Rhumorbarbe, and Alain Gallusser. 2021. “Manufacturing of Firearms Parts: Relevant Sources of Information and Contribution in a Forensic Context.” WIREs Forensic Science 3 (3): e1401. https://wires.onlinelibrary.wiley.com/doi/abs/10.1002/wfs2.1401.
Wickham, Hadley. 2009. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. http://ggplot2.org.
———. 2014. “Tidy Data.” Journal of Statistical Software 59 (10). http://www.jstatsoft.org/v59/i10/.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wilkinson, Leland. 2005. The Grammar of Graphics. Berlin, Heidelberg: Springer-Verlag.
Tai, Xiao Hui. 2018. “Comparing Cartridge Breechface Marks: 2d Versus 3d.” Center for Statistics and Applications in Forensic Evidence. January 31, 2018. https://forensicstats.org/blog/portfolio/comparing-cartridge-breechface-marks-2d-versus-3d/.
Xie, Yihui. 2014. “Knitr: A Comprehensive Tool for Reproducible Research in R.” In Implementing Reproducible Computational Research, edited by Victoria Stodden, Friedrich Leisch, and Roger D. Peng. Chapman and Hall/CRC. http://www.crcpress.com/product/isbn/9781466561595.
———. 2015. Dynamic Documents with R and Knitr. 2nd ed. Boca Raton, Florida: Chapman and Hall/CRC. https://yihui.org/knitr/.
Zemmels, Joseph, Heike Hofmann, and Susan VanderPlas. 2022. cmcR: An Implementation of the ‘Congruent Matching Cells’ Method. https://CRAN.R-project.org/package=cmcR.
Zemmels, Joseph, Heike Hofmann, and Susan Vanderplas. 2022. “Zemmels et al. (2023) Cartridge Case Scans.” Iowa State University. https://doi.org/10.25380/IASTATE.19686297.V1.
Zemmels, Joseph, Susan VanderPlas, and Heike Hofmann. 2023. “A Study in Reproducibility: The Congruent Matching Cells Algorithm and cmcR Package.” The R Journal 14 (4): 79–102. https://doi.org/10.32614/rj-2023-014.
Zhang, Hao, Jialing Zhu, Rongjing Hong, Hua Wang, Fuzhong Sun, and Anup Malik. 2021. “Convergence-Improved Congruent Matching Cells (CMC) Method for Firing Pin Impression Comparison.” Journal of Forensic Sciences 66 (2): 571–82. https://doi.org/10.1111/1556-4029.14634.
Zheng, Xiaoyu A., Johannes A. Soons, and Robert M. Thompson. 2016. “NIST Ballistics Toolmark Research Database.” https://tsapps.nist.gov/NRBTD/.
Zheng, Xiaoyu, Johannes Soons, Robert Thompson, Sushama Singh, and Cerasela Constantin. 2020. “NIST Ballistics Toolmark Research Database.” Journal of Research of the National Institute of Standards and Technology 125 (January). https://doi.org/10.6028/jres.125.004.
Zheng, X, J Soons, T V Vorburger, J Song, T Renegar, and R Thompson. 2014. “Applications of Surface Metrology in Firearm Identification.” Surface Topography: Metrology and Properties 2 (1): 014012. https://doi.org/10.1088/2051-672x/2/1/014012.
Zimmerman, Naupaka, Greg Wilson, Raniere Silva, Scott Ritchie, François Michonneau, Jeffrey Oliver, Harriet Dashnow, et al. 2019. “swcarpentry/r-novice-gapminder: Software Carpentry: R for Reproducible Scientific Analysis, June 2019.” Zenodo. https://zenodo.org/record/3265164.

References

Baldwin, David P., Stanley J. Bajic, Max Morris, and Daniel Zamzow. 2014. “A Study of False-Positive and False-Negative Error Rates in Cartridge Case Comparisons.” Defense Technical Information Center. https://doi.org/10.21236/ada611807.
Basu, Nabanita, Rachel S. Bolton-King, and Geoffrey Stewart Morrison. 2022. “Forensic Comparison of Fired Cartridge Cases: Feature-Extraction Methods for Feature-Based Calculation of Likelihood Ratios.” Forensic Science International: Synergy 5: 100272. https://doi.org/10.1016/j.fsisyn.2022.100272.
Bay, Herbert, Tinne Tuytelaars, and Luc Van Gool. 2006. “SURF: Speeded Up Robust Features.” In Computer Vision - ECCV 2006, 404–17. Springer Berlin Heidelberg. https://doi.org/10.1007/11744023_32.
Morrison, Geoffrey Stewart, and Ewald Enzinger. 2018. “Score Based Procedures for the Calculation of Forensic Likelihood Ratios - Scores Should Take Account of Both Similarity and Typicality.” Science & Justice 58 (1): 47–58. https://doi.org/10.1016/j.scijus.2017.06.005.
Reinders, Stephanie, Yong Guan, Danica Ommen, and Jennifer Newman. 2022. “Source-Anchored, Trace-Anchored, and General Match Score-Based Likelihood Ratios for Camera Device Identification.” Journal of Forensic Sciences 67 (3): 975–88. https://doi.org/10.1111/1556-4029.14991.