random util
some stuff from my history
xq ~/Downloads/yarle-test/input/foo_export_test.enex 'del(.note)'
xq 'del(.note)' ~/Downloads/yarle-test/input/foo_export_test.enex
xq 'del(.en-export)' ~/Downloads/yarle-test/input/foo_export_test.enex
xq 'del(.en-export)' ~/Downloads/yarle-test/input/foo_export_test.enex
xq 'del(.note)' ~/Downloads/yarle-test/input/foo_export_test.enex
xq 'del(."en-export".note)' ~/Downloads/yarle-test/input/foo_export_test.enex
xq 'del(."en-export".note)' ~/Downloads/yarle-test/input/export_test.enex
xq ~/Downloads/yarle-test/input/export_test.enex
xq '.' ~/Downloads/yarle-test/input/export_test.enex
less Blog.enex
➜ ~ ls -l /opt/homebrew/bin/xq
lrwxr-xr-x 1 tjen admin 32 3 jul 19:25 /opt/homebrew/bin/xq -> ../Cellar/python-yq/3.4.3/bin/xq
strip content from xml
xq 'del(."en-export".note[].content)' ~/Downloads/yarle-run/input/evernote.toc/all/Blog.enex
{
  "en-export": {
    "@export-date": "20231213T100406Z",
    "@application": "Evernote",
    "@version": "10.66.3",
    "note": [
      {
        "title": "Table of Contents",
        "created": "20240704T231604+01:00",
        "updated": "20240704T231604+01:00",
        "note-attributes": {
          "author": "evernote-toc",
          "source": "desktop.mac"
        }
      },
      {
        "title": "Conference: SocratesBE 2022",
        "created": "20220710T120501Z",
        "updated": "20231109T160405Z",
        "tag": [
          "published",
          "i.conference",
          "conference.socratesbe22"
        ],
        "note-attributes": {
          "author": "Tjen Wellens",
          "source": "desktop.mac"
        },
        "resource": [
          {
            "data": {
              "@encoding": "base64",
strip content and resources + keep output xml
xq --xml-output 'del(."en-export".note[].resource, ."en-export".note[].content)' ~/Downloads/yarle-run/input/evernote.toc/all/Blog.enex > blog-without-resources-or-content.xml
xq on .enex
remove giant ass resource data, store in json for faster access
xq 'del(."en-export".note[].resource)' ~/Downloads/yarle-run/input/evernote.toc/all/Blog.enex | tee all-without-resources.json
show only notes with title “Untitled Note”
jq '."en-export".note[] | select(.title == "Untitled Note")' all-without-resources.json > untitled.json
can I use date to track down?
cat untitled.json | jq '.created' | sort | uniq -c
creation dates that occur more than once across all notes
cat all-without-resources.json | jq '."en-export".note[].created' | sort | uniq -c | grep -v "^ *1 "
2 "20180603T122012Z"
2 "20190316T171536Z"
11 "20190718T053000Z"
3 "20200409T100000Z"
2 "20201124T144351Z"
2 "20201224T110246Z"
5 "20210103T140945Z"
2 "20210103T152022Z"
2 "20220714T080621Z"
2 "20220718T112836Z"
7 "20220821T104335Z"
3 "20220821T111151Z"
14 "20220825T090202Z"
cat untitled.json | jq -r '.created' | sort > dates.untitled.txt
cat all-without-resources.json | jq -r '."en-export".note[].created' | sort | uniq -c | grep -v "^ *1 " | sed 's/^ *[0-9][0-9]* //' > dates.all.duplicates.txt
find common lines in both files
comm -12 dates.all.duplicates.txt dates.untitled.txt
20220821T104335Z -> 7 "20220821T104335Z" -> only one untitled note has a creation date shared with other notes, so that is the only title I cannot match purely on date; for it I need to check the 7 notes with that date
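`comm -12` prints only the lines that occur in both inputs, and it expects both files to be sorted the same way. A minimal sketch with made-up timestamps (the temp files `dup.txt` and `untitled.txt` and their contents are illustrative stand-ins for dates.all.duplicates.txt and dates.untitled.txt):

```shell
# Hypothetical sample data, not taken from the real export.
tmp=$(mktemp -d)
printf '%s\n' 20190718T053000Z 20220821T104335Z 20220825T090202Z > "$tmp/dup.txt"
printf '%s\n' 20130814T193511Z 20220821T104335Z 20220831T201458Z > "$tmp/untitled.txt"

# -1 hides lines found only in the first file, -2 hides lines found only
# in the second, so what remains is the set of lines common to both.
common=$(comm -12 "$tmp/dup.txt" "$tmp/untitled.txt")
echo "$common"

rm -rf "$tmp"
```

With this sample only the shared timestamp is printed, which is the same situation as the single ambiguous date above.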
date + evernote_toc metadata
cp ~/dev/tjen/evernote-toc/out/Blog.json ./
cat Blog.json| jq '.notes[] | {title, created: (.created / 1000 ) | strftime("%Y%m%dT%H%M%SZ") }'
{ "title": "Google IO: Team Geek", "created": "20141225T190200Z" }
{ "title": "On Martial Arts and knife fighting", "created": "20160608T195129Z" }
{ "title": "Speed reading", "created": "20161005T133346Z" }
filter untitled note titles
(partial) get timestamps of missing notes
cat dates.untitled.txt | sed 's/\(.*\)/"\1",/' | tr -d '\n' | sed -e 's/^/[/' -e 's/,$/]/'
["20130814T193511Z",…,"20220831T201458Z"]
(partial) filter note + title by timestamp
cat Blog.json| jq -r '.notes[] | {title, created: (.created / 1000 ) | strftime("%Y%m%dT%H%M%SZ") } | select(.created as $created | ["20130814T193511Z"] | index($created) )'
(full) filter untitled note titles, sorted by created time
cat Blog.json| jq -r '.notes[] | {title, created: (.created / 1000 ) | strftime("%Y%m%dT%H%M%SZ") } | select(.created as $created | '"$(cat dates.untitled.txt | sed 's/\(.*\)/"\1",/' | tr -d '\n' | sed -e 's/^/[/' -e 's/,$/]/')"' | index($created) )' | jq -s 'sort_by(.created)' |tee missing-titles-with-dates.json
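The `$(...)` substitution inside the command above turns the lines of dates.untitled.txt into a JSON array literal that jq can pass to `index`. A minimal sketch of just that step, using made-up sample timestamps in a variable instead of the real file:

```shell
# Hypothetical sample standing in for dates.untitled.txt.
dates=$(printf '%s\n' 20130814T193511Z 20220821T104335Z 20220831T201458Z)

# 1. wrap every line in double quotes and append a comma
# 2. join all lines into one
# 3. prepend "[" and turn the trailing comma into "]"
array=$(printf '%s\n' "$dates" \
  | sed 's/\(.*\)/"\1",/' \
  | tr -d '\n' \
  | sed -e 's/^/[/' -e 's/,$/]/')
echo "$array"
```

This is only safe because the timestamps contain no quotes or backslashes. For arbitrary strings, `jq -R . dates.untitled.txt | jq -sc .` should build the same array with proper escaping.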
which titles are missing?
cat ~/Downloads/yarle-run/output2/notes/Blog/Table_of_Contents.md | head
delete frontmatter
-e '/---/,/---/d'
delete stuff from line, keep only title
-e 's/^[0-9][0-9]*\. \[\[.*\.md|\(.*\)\]\]$/\1/'
get titles from table_of_contents.md
cat ~/Downloads/yarle-run/output2/notes/Blog/Table_of_Contents.md | sed -e '/---/,/---/d' -e 's/^[0-9][0-9]*\. \[\[.*\.md|\(.*\)\]\]$/\1/' | sort > toc-titles.txt
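To see what the two sed expressions do, here is a minimal sketch on a made-up table of contents. The file layout (front matter between `---` lines, then numbered `[[File.md|Title]]` links) is an assumption inferred from the expressions themselves, not copied from the real Table_of_Contents.md:

```shell
# Hypothetical Table_of_Contents.md content.
toc=$(printf '%s\n' \
  '---' \
  'title: Table of Contents' \
  '---' \
  '1. [[Speed_reading.md|Speed reading]]' \
  '2. [[Blog_topic_categories.md|Blog topic categories]]')

# First expression: delete everything from the first --- line through the next one.
# Second expression: keep only the link text after the "|" on each numbered line.
titles=$(printf '%s\n' "$toc" \
  | sed -e '/---/,/---/d' \
        -e 's/^[0-9][0-9]*\. \[\[.*\.md|\(.*\)\]\]$/\1/' \
  | sort)
echo "$titles"
```

The result is one bare title per line, sorted, ready to be compared against the titles from the enex.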
get titles from enex.json file (translated some characters to match toc)
jq -r '."en-export".note[].title' all-without-resources.json | tr ':?"/|' '_' |tr -d '#[]' | sed -e 's/\.*\.$/_/' -e 's/ / /g' | sort > enex-titles.txt
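The character translation can be tried on its own. A minimal sketch, assuming made-up original titles and a hypothetical helper name `normalize`; it leaves out the `s/ / /g` step and relies on `tr` padding the one-character replacement set, which GNU and BSD `tr` both do:

```shell
# normalize is a hypothetical helper wrapping the tr/sed steps from the command above.
normalize() {
  # map : ? " / | to _, drop # [ ], collapse trailing dots into one _
  tr ':?"/|' '_' | tr -d '#[]' | sed -e 's/\.*\.$/_/'
}

# Hypothetical original titles as they might appear in the enex.
out=$(printf '%s\n' \
  'Model: 3X - software stages' \
  'Can we use an Andon Cord in programming?' \
  'To be continued...' \
  | normalize)
echo "$out"
```

The first two results have the same shape as the titles in the table of contents, which is what makes the later diff line up.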
find diffs
diff --width=$COLUMNS --suppress-common-lines --side-by-side --color=always enex-titles.txt toc-titles.txt
diff enex-titles.txt toc-titles.txt | grep "^<" | wc -l
29 # because note Table_of_Contents was hacked into the enex, but does not contain itself
diff enex-titles.txt toc-titles.txt | grep "^>" | wc -l
28
these notes are screwed up
diff enex-titles.txt toc-titles.txt | grep "^<" | sed 's/^> //' | sort | uniq -c
1 < Table of Contents
28 < Untitled Note
These are the titles that should have been there
diff enex-titles.txt toc-titles.txt | grep "^>" | sed 's/^> //'
Article_ We are creators - André Chaperon, Shawn Twing
Blog topic categories
Book_ The Art of Empathy - Karla McLaren
Book_ The Start-Up J Curve - Howard Love
Book_ This is LEAN - Niklas Modig & Par Ahlstrom
Can we use an Andon Cord in programming_
Changing culture by selecting Hero Stories
Coding Pricinples (zooming patterns)
Course_ Cloud Native Entrepreneur - Patrick Lee Scott
IMG_20220729_193455.jpeg
Metaphor_ Catching the big fish vs learning to fish
Model_ 3 Axes of software development
Model_ 3X - software stages
Model_ Action Requiring Neurological Program
Model_ Grouping tests
Model_ Hypothesis-driven delivery
Model_ incremental vs iterative
Model_ naming tests
Model_ primary vs secondary needs
Model_ tests must clearly express required functionality
Model_ workgroup vs team
Opinion_ Where there is blame, there is no learning
Opinion_ todo - doing - done is too limited
TDD styles books
The 7 Habits of Highly Effective People
Video_ Code as Risk • Kevlin Henney
Video_ The Infinite Game_ How to Lead in the 21st Century - Simon Sinek
Video_ The Secret Assumption of Agile - Fred George
replace inline with xq
WIP fragment
cat blog-without-resources-or-content.xml | xq '{"20240704T231604+01:00":"foo","20220710T120501Z":"bar"} as $created_title |."en-export".note[]|=(. + {title: (if $created_title[.created] then $created_title[.created] else .created end ) } )' | head -n 20
WIP step prep created_title.json
cat Blog.json| jq -r '.notes[] | {title, created: (.created / 1000 ) | strftime("%Y%m%dT%H%M%SZ") } | select(.created as $created | '"$(cat dates.untitled.txt | sed 's/\(.*\)/"\1",/' | tr -d '\n' | sed -e 's/^/[/' -e 's/,$/]/')"' | index($created) )' | jq -sr 'sort_by(.created) | map({(.created|tostring):.title}) | add' |sed -e 's/^ *//' -e 's/: /:/' -e 's/ *$//' | tr -d '\n' > created_title.json
WIP done
cat blog-without-resources-or-content.xml | xq ''"$(cat created_title.json)"' as $created_title |."en-export".note[]|=(. + {title: (if (.title == "Untitled Note") then $created_title[.created] else .title end ) } )' | less
nice about this method (check if untitled note, and only then do lookup)
- does not matter if other notes have the same date
- as long as the notes with “untitled note” have unique dates it’s fine!
WIP speedup
cat blog-without-resources-or-content.xml | xq "$(cat created_title.json)"' as $created_title |."en-export".note[]|=(. + if .title == "Untitled Note" then {title:$created_title[.created]} else {} end )' | less
full solution keeping xml
xq --xml-output "$(cat created_title.json)"' as $created_title |."en-export".note[]|=(. + if .title == "Untitled Note" then {title:$created_title[.created]} else {} end )' ~/Downloads/yarle-run/input/evernote.toc/all/Blog.enex > Blog.fixed.enex
speedrun
prep dates from “Untitled Note”
xq -r '."en-export".note[] | {title,created} |select(.title == "Untitled Note") | .created' ~/Downloads/yarle-run/input/evernote.toc/all/Blog.enex | sort > dates.untitled.txt
verify that there are no duplicates
cat dates.untitled.txt | sort | uniq -c
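With `uniq -c` the check is done by eye. A variant that prints only the offending dates is `uniq -d`; a minimal sketch on made-up timestamps, where the repeated value is deliberate:

```shell
# Hypothetical timestamps; the first and last are the same on purpose.
dates=$(printf '%s\n' 20220821T104335Z 20130814T193511Z 20220821T104335Z)

# uniq -d prints one copy of every line that occurs more than once,
# so empty output means all dates are unique.
dupes=$(printf '%s\n' "$dates" | sort | uniq -d)
if [ -z "$dupes" ]; then
  echo "no duplicates"
else
  echo "duplicates: $dupes"
fi
```

If anything is reported, the lookup by creation date is ambiguous for those notes and they need to be matched by hand.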
prep created_title.json
- Blog.json from https://github.com/TjenWellens/evernote-toc-enex
cat Blog.json| jq -r '.notes[] | {title, created: (.created / 1000 ) | strftime("%Y%m%dT%H%M%SZ") } | select(.created as $created | '"$(cat dates.untitled.txt | sed 's/\(.*\)/"\1",/' | tr -d '\n' | sed -e 's/^/[/' -e 's/,$/]/')"' | index($created) )' | jq -sr 'sort_by(.created) | map({(.created|tostring):.title}) | add' |sed -e 's/^ *//' -e 's/: /:/' -e 's/ *$//' | tr -d '\n' > created_title.json
xq --xml-output "$(cat created_title.json)"' as $created_title |."en-export".note[]|=(. + if .title == "Untitled Note" then {title:$created_title[.created]} else {} end )' ~/Downloads/yarle-run/input/evernote.toc/all/Blog.enex > Blog.fixed.enex