cedartech presentation
TRANSCRIPT
-
7/31/2019 CedarTech Presentation
1/45
CEDAR-FOX
A Computational Tool for Questioned Handwriting Examination
-
Computational Forensics
Forensic domains involving pattern matching
Motivated by importance of quantitative methods in the forensic sciences
1. Daubert Ruling
2. High Standards established by DNA
3. Computers
   1. Low Cost
   2. Advances in Artificial Intelligence/Pattern Recognition
4. Improved Statistical Methods for Evidence
   E.g., Aitken and Taroni, Statistics and the Evaluation of Evidence for Forensic Scientists, Wiley, 2004
-
QDE
Bureau of Justice Statistics (2002): among the 50 largest publicly funded crime labs
* 57% perform QD function
* 5,231 cases requested
* 1,079 backlogged at year end
Significantly larger case load internationally
Handwriting is common in QD case work
-
CEDAR Research on Handwriting QDE
Quantifying discriminatory power of handwriting
- Testing on national database, twins data
Feedback from QDEs in developing computational tools
- Workshops at ASQDE
- Joint meeting of MAFS, CAFS
- SWAFDE
Developing Statistical Evidence Theory
-
CEDAR-FOX Software System
Principal Functions
Writer Verification/Identification
Document Properties
Signature Verification
Document Search
-
Computer System Requirements
Processor: Pentium class processor (P4 or higher recommended)
Operating Systems: Windows NT, 2000, XP, Vista
Random Access Memory: 256MB on XP and earlier, 512MB on Vista
Secondary Storage: 30MB available disk space
-
Writer Verification
Known
Questioned
-
Sample Preparation: Rule Line Removal
Original Ruled Text
User Control
Removed Lines
-
Associating Truth with Word Images
Image
Truth/Transcript
Transcript Map
-
Transcript Mapping
-
Extracted Characters (Letters)
-
Features Extracted
-
Distance and LLR Value
Distance = 0.35, LLR = -0.26
Distance = 0.16, LLR = 1.49
Distance = 0.43, LLR = -0.97
-
Histograms and PDFs of Distances
[Figure: two groups of plots, one for the macro feature slant and one for the micro feature letter "e". Each group shows a same-writer histogram and a different-writer histogram (x-axis: Distance, y-axis: Count), plus the corresponding probability density functions (x-axis: Distance, y-axis: Probability density) with one curve for same writer and one for different writer.]
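The step from histograms of distances to smooth PDFs can be illustrated with a small sketch. It assumes a method-of-moments fit, which is only one possible fitting procedure; the slides do not say how the densities were actually estimated. The function names and the sample distances are hypothetical.

```python
import statistics


def fit_gaussian(samples):
    """Return (mean, std) of a Gaussian fitted to the samples."""
    return statistics.fmean(samples), statistics.pstdev(samples)


def fit_gamma_moments(samples):
    """Return (shape, scale) of a Gamma distribution by method of moments.

    Uses mean = shape * scale and variance = shape * scale**2.
    This is an assumed fitting method, not one confirmed by the slides.
    """
    mean = statistics.fmean(samples)
    var = statistics.pvariance(samples)
    if mean <= 0 or var <= 0:
        raise ValueError("need positive mean and non-zero variance")
    return mean * mean / var, var / mean


# Hypothetical distances between pairs of samples (illustrative values only).
same_writer_distances = [0.12, 0.16, 0.20, 0.22, 0.18]
different_writer_distances = [0.35, 0.43, 0.40, 0.47, 0.38]

gamma_shape, gamma_scale = fit_gamma_moments(same_writer_distances)
gauss_mean, gauss_std = fit_gaussian(different_writer_distances)
```

A maximum-likelihood fit would be a reasonable alternative; the method of moments is used here only because it needs nothing beyond the standard library.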
-
Comparison of Words
Distance = 0.3702
LLR = -0.35
Distance = 0.2022
LLR = 4.44
-
Word Shape Comparison
-
Bigram Shapes
Distance = 0.1996
LLR = 4.35
Distance = 0.3735
LLR = -0.47
-
th combination and similarity score
Comparing Letter Pairs
-
Macro Features
-
Macro and Micro Feature Scores
Pictorial Attribute Scores (Macro)
Letter Formation Scores (Micro)
-
Results of Verification
Feature Comparison Table
Strength of Evidence
-
Strength of Evidence Computation
Based on similarities in a representative database of 1,500 writers providing 3 pages of writing each
Probability distributions of similarities modeled by Gamma and Gaussian distributions
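As a rough sketch of how a log-likelihood ratio (LLR) could be derived from such fitted distributions, the example below evaluates a same-writer density and a different-writer density at an observed distance and takes the log of their ratio. Several things here are assumptions: the slides do not say which class uses the Gamma and which the Gaussian, which logarithm base is used, or what the fitted parameters are. All parameter values and function names are hypothetical.

```python
import math


def gamma_pdf(x, shape, scale):
    """Density of a Gamma(shape, scale) distribution at x."""
    if x <= 0:
        return 0.0
    log_pdf = ((shape - 1) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def gaussian_pdf(x, mean, std):
    """Density of a Gaussian(mean, std) distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def log_likelihood_ratio(distance, same_params, diff_params):
    """Natural-log ratio of same-writer to different-writer density.

    Assumption: same-writer distances follow a Gamma, different-writer
    distances follow a Gaussian. Positive values favour "same writer".
    """
    tiny = 1e-300  # floor to avoid log(0) far out in the tails
    p_same = max(gamma_pdf(distance, *same_params), tiny)
    p_diff = max(gaussian_pdf(distance, *diff_params), tiny)
    return math.log(p_same / p_diff)


# Hypothetical parameters, not taken from CEDAR-FOX.
SAME_WRITER_GAMMA = (2.0, 0.1)      # shape, scale
DIFF_WRITER_GAUSSIAN = (0.4, 0.1)   # mean, std

llr_small = log_likelihood_ratio(0.16, SAME_WRITER_GAMMA, DIFF_WRITER_GAUSSIAN)
llr_large = log_likelihood_ratio(0.43, SAME_WRITER_GAMMA, DIFF_WRITER_GAUSSIAN)
```

With these illustrative parameters a small distance gives a positive LLR and a large distance gives a negative one, which matches the sign pattern of the Distance and LLR pairs shown on the earlier slides, though not their exact values.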
-
Similar Writing of Twins
LLR = 7.15
-
Ranked Document List
Writer Identification
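One simple way to picture a ranked document list is to sort the known documents by their distance to the questioned document, closest first. This is only a sketch under that assumption; the slide does not state the score CEDAR-FOX ranks by, and the document names and distances below are hypothetical.

```python
def rank_documents(distances):
    """Return (document, distance) pairs sorted from most to least similar.

    `distances` maps a known-document identifier to its distance from the
    questioned document; a smaller distance means a closer match.
    """
    return sorted(distances.items(), key=lambda item: item[1])


# Hypothetical distances from one questioned document to three known ones.
example = {"known_a": 0.43, "known_b": 0.16, "known_c": 0.35}
ranked = rank_documents(example)
```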
-
Word Recognition
Lexicon Selection
-
Word Comparison
And Similarity Score
Word Similarities
-
Document Properties
-
Document Line Structure
Document Properties
-
User selects
Character to be displayed
Comparing Letter Formations
-
Contour Display
-
Query Image
Searching Documents by Word Image
-
Searching Documents by Text Query
-
Search Modalities
Query: Text Word; Retrieval: Word Images
Query: Word Image; Retrieval: Words (Text)
Query: Word Image; Retrieval: Word Images
-
Signature Matching
Genuine Set
Scores for Questioned Signatures
-
User Manual
Available in Help Menu
Organized by Topics Hierarchically
-
Tool Bar Icons
-
Summary
CEDAR-FOX is a system for assisting the QDE in dealing with handwriting
Has automated tools for writer/signature verification/identification
Has tools for case-work display
Computes strength of evidence
-
Competency Training
Because CEDAR-FOX has many functions, it is necessary to gain familiarity with its use
No formal training program set up yet
-
QD Community Acceptance
Has been tested by several agencies:
* Canada Border Agency
* FBI, with results presented at ASQDE-Montreal
* USSS internal testing
Trial versions with several QDEs
Further feedback solicited
-
Upgrades to CEDAR-FOX
To be included in CEDAR-FOX version 1.2
Additional Tools for Image Manipulation
* Eraser Tool
Database Interfaces
* MySQL
-
Upgrades to Software: Future Releases
Improved Statistical Model
* Current statistical model in system uses independence assumption
* Performance is not as high as with better theoretical models, e.g., neural networks
* Plan to incorporate a compromise model, e.g., pairwise independence
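To make the independence assumption concrete: if the features are treated as statistically independent, their likelihood ratios multiply, so their log-likelihood ratios simply add. The sketch below shows only that arithmetic. It is not the CEDAR-FOX implementation, the function names are hypothetical, and the three per-feature values are reused from the Distance and LLR Value slide purely as illustrative inputs.

```python
import math


def combined_llr(feature_llrs):
    """Sum per-feature LLRs, which is valid only if features are independent."""
    return sum(feature_llrs)


def likelihood_ratio(llr):
    """Convert a natural-log LLR back to a plain likelihood ratio."""
    return math.exp(llr)


# Illustrative per-feature LLRs (natural log is an assumption).
per_feature = [-0.26, 1.49, -0.97]
total = combined_llr(per_feature)
ratio = likelihood_ratio(total)
```

When features are in fact correlated, this sum double-counts shared evidence, which is one reason a model with weaker assumptions, such as the pairwise-independence compromise the slide mentions, may perform better.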
-
Future Releases: Line Segmentation Improvements