Merge pull request #15 from JarbasAl/android

android
el-tocino 2020-12-19 11:27:58 -06:00 committed by GitHub
commit 4854e39b7f
GPG Key ID: 4AEE18F83AFDEB23
8 changed files with 54 additions and 0 deletions

Binary file not shown.

@ -0,0 +1,35 @@
# Android models
### EN
#### android-en-0.3-20201215-jarbas.tar.gz
Trained on a personal dataset. For the not-wake-word class, the google-commands
dataset, false_activations-20201215.tar.gz, bg_noise-20201215.tar.gz and
noises-20190814-elt.tar.gz were used.
Training data contained several speakers from various countries and genders,
with about 3 samples per speaker; microphone quality and pronunciation varied
a lot between speakers.
Added to this were 64 samples of my own voice, collected with mycroft-core during
testing of the initial model available at android-20201215-jarbas.tar.gz.
=== TrainData ===
- wake_words=114
- not_wake_words=14444
- test_wake_words=38
- test_not_wake_words=14496
=== Test set results ===
- 14513 out of 14533
- 99.86%
- 0.07% false positives
- 26.32% false negatives
- False Positives: 10
- True Negatives: 14485
- False Negatives: 10
- True Positives: 28
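
The summary percentages can be re-derived from the four counts listed above. The following is a minimal sketch, not the tool's own code: the function name `summarize` is hypothetical, and it assumes the false positive and false negative rates are computed per class (false positives over all negatives, false negatives over all positives).

```python
def summarize(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Derive summary percentages from confusion-matrix counts."""
    total = tp + tn + fp + fn
    correct = tp + tn
    return {
        "correct": correct,
        "total": total,
        "accuracy_pct": round(100 * correct / total, 2),
        # Assumption: rates are relative to each class, not to the whole set.
        "false_positive_pct": round(100 * fp / (fp + tn), 2),
        "false_negative_pct": round(100 * fn / (fn + tp), 2),
    }


# Counts taken from the test set results above.
stats = summarize(tp=28, tn=14485, fp=10, fn=10)
print(stats)
```

With only 38 positive test samples, each missed wake word moves the false negative rate by about 2.6 percentage points, so that figure is a coarse estimate.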

@ -0,0 +1,9 @@
I, JarbasAi, hereby agree to waive all claim of copyright (economic and moral) in all content contributed by me, the user, and immediately place any and all contributions by me into the public domain, unless otherwise noted.
I grant anyone the right to use my work for any purpose, without any conditions, to be changed or destroyed in any manner whatsoever without any attribution or notification.
No warranties are given.
Files released into public domain:
not-wake-words/noises/bg_noise-20201215-jarbas.tar.gz
not-wake-words/noises/false_activations-20201215-jarbas.tar.gz
android/en/models/android-en-0.3-20201215-jarbas.tar.gz
android/en/android-20201215-jarbas.tar.gz

@ -0,0 +1,3 @@
These samples are "silence" cropped from between wake word recordings:
each long recording contained several wake word samples, and these files
are the gaps between them.
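
How the cropping was done is not stated here. As an illustration only, one way to locate the gaps in a long recording is a simple frame-energy threshold. The function `find_silence` and its parameters (`frame_len`, `threshold`, `min_frames`) are hypothetical, and the input is assumed to be a list of samples normalized to the range -1.0 to 1.0.

```python
from typing import List, Tuple


def find_silence(samples: List[float], frame_len: int,
                 threshold: float, min_frames: int) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges where every frame's RMS is below threshold."""
    segments = []
    run_start = None  # index of the first quiet frame in the current run
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = (sum(x * x for x in frame) / frame_len) ** 0.5
        if rms < threshold:
            if run_start is None:
                run_start = i
        else:
            # A loud frame closes the run; keep it only if it is long enough.
            if run_start is not None and i - run_start >= min_frames:
                segments.append((run_start * frame_len, i * frame_len))
            run_start = None
    # Handle a quiet run that reaches the end of the recording.
    if run_start is not None and n_frames - run_start >= min_frames:
        segments.append((run_start * frame_len, n_frames * frame_len))
    return segments


# Toy signal: loud, quiet, loud, then a quiet tail too short to keep.
signal = [0.5] * 4 + [0.0] * 6 + [0.5] * 4 + [0.0] * 2
gaps = find_silence(signal, frame_len=2, threshold=0.1, min_frames=2)
print(gaps)
```

Real recordings would need a threshold tuned to the microphone's noise floor; the values used here are for the toy signal only.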

Binary file not shown.

@ -0,0 +1,7 @@
These samples are false activations recorded by mycroft-core with the
"record_wake_words" flag set to True.
In northern Portugal it's common to say "ó XXX" instead of "hey XXX". A
model I was training learned to always trigger on the "ó"; many of these
samples reflect that and should be helpful for this kind of wake word.
Other samples contain random noises and potentially some dialog from TV.
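
For anyone wanting to collect similar samples, the flag mentioned above is set in the mycroft-core configuration. The sketch below assumes the flag lives under the `listener` section of the user's mycroft.conf; verify this against your mycroft-core version. The helper `with_wake_word_recording` is hypothetical and only builds the configuration dictionary, without writing to disk.

```python
import copy
import json


def with_wake_word_recording(conf: dict) -> dict:
    """Return a copy of conf with wake word recording enabled."""
    updated = copy.deepcopy(conf)
    # Assumption: the flag sits under the "listener" section.
    updated.setdefault("listener", {})["record_wake_words"] = True
    return updated


# Example user configuration; the "lang" entry is only a placeholder.
user_conf = {"lang": "en-us"}
new_conf = with_wake_word_recording(user_conf)
print(json.dumps(new_conf, indent=2))
```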