random util

some stuff from my history

xq ~/Downloads/yarle-test/input/foo_export_test.enex 'del(.note)'
xq 'del(.note)' ~/Downloads/yarle-test/input/foo_export_test.enex
xq 'del(.en-export)' ~/Downloads/yarle-test/input/foo_export_test.enex
xq 'del(.en-export)' ~/Downloads/yarle-test/input/foo_export_test.enex
xq 'del(.note)' ~/Downloads/yarle-test/input/foo_export_test.enex
xq 'del(."en-export".note)' ~/Downloads/yarle-test/input/foo_export_test.enex
xq 'del(."en-export".note)' ~/Downloads/yarle-test/input/export_test.enex
xq ~/Downloads/yarle-test/input/export_test.enex
xq '.' ~/Downloads/yarle-test/input/export_test.enex
less Blog.enex
➜  ~ ls -l /opt/homebrew/bin/xq
lrwxr-xr-x  1 tjen  admin  32  3 jul 19:25 /opt/homebrew/bin/xq -> ../Cellar/python-yq/3.4.3/bin/xq

strip content from xml

xq 'del(."en-export".note[].content)' ~/Downloads/yarle-run/input/evernote.toc/all/Blog.enex

{
  "en-export": {
    "@export-date": "20231213T100406Z",
    "@application": "Evernote",
    "@version": "10.66.3",
    "note": [
      {
        "title": "Table of Contents",
        "created": "20240704T231604+01:00",
        "updated": "20240704T231604+01:00",
        "note-attributes": {
          "author": "evernote-toc",
          "source": "desktop.mac"
        }
      },
      {
        "title": "Conference: SocratesBE 2022",
        "created": "20220710T120501Z",
        "updated": "20231109T160405Z",
        "tag": [
          "published",
          "i.conference",
          "conference.socratesbe22"
        ],
        "note-attributes": {
          "author": "Tjen Wellens",
          "source": "desktop.mac"
        },
        "resource": [
          {
            "data": {
              "@encoding": "base64",

strip content and resources + keep output xml

xq --xml-output 'del(."en-export".note[].resource, ."en-export".note[].content)' ~/Downloads/yarle-run/input/evernote.toc/all/Blog.enex > blog-without-resources-or-content.xml

xq on .enex

remove giant ass resource data, store in json for faster access

xq 'del(."en-export".note[].resource)' ~/Downloads/yarle-run/input/evernote.toc/all/Blog.enex | tee all-without-resources.json

show only notes with title “Untitled Note”

jq '."en-export".note[] | select(.title == "Untitled Note")' all-without-resources.json > untitled.json

can I use the creation date to track them down?

cat untitled.json | jq '.created' | sort | uniq -c

creation dates shared by more than one note (across all notes)

cat all-without-resources.json | jq '."en-export".note[].created' | sort | uniq -c | grep -v "^ *1 "

   2 "20180603T122012Z"
   2 "20190316T171536Z"
  11 "20190718T053000Z"
   3 "20200409T100000Z"
   2 "20201124T144351Z"
   2 "20201224T110246Z"
   5 "20210103T140945Z"
   2 "20210103T152022Z"
   2 "20220714T080621Z"
   2 "20220718T112836Z"
   7 "20220821T104335Z"
   3 "20220821T111151Z"
  14 "20220825T090202Z"

cat untitled.json | jq -r '.created' | sort > dates.untitled.txt
cat all-without-resources.json | jq -r '."en-export".note[].created' | sort | uniq -c | grep -v "^ *1 " | sed 's/^ *[0-9][0-9]* //' > dates.all.duplicates.txt

find common lines in both files

comm -12 dates.all.duplicates.txt dates.untitled.txt

20220821T104335Z -> 7 "20220821T104335Z" -> so there is only one untitled note that I cannot match purely on date; for that one I need to check the 7 notes sharing this timestamp
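
The duplicate-date check above can be replayed on a tiny data set. This is only a sketch: the timestamps and the temp directory are made up, and the file names just mirror the ones used above.

```shell
# Invented sample data, written to a throwaway directory.
tmp=$(mktemp -d)

# Creation dates of all notes (one per line, unsorted).
printf '%s\n' 20200101T000000Z 20220821T104335Z 20190316T171536Z \
              20220821T104335Z 20190316T171536Z 20210505T101010Z > "$tmp/all.txt"

# Creation dates of the "Untitled Note" entries only, sorted for comm.
printf '%s\n' 20220821T104335Z 20210505T101010Z | sort > "$tmp/dates.untitled.txt"

# Keep only dates that occur more than once, then strip the count column.
sort "$tmp/all.txt" | uniq -c | grep -v '^ *1 ' | sed 's/^ *[0-9][0-9]* //' \
  > "$tmp/dates.all.duplicates.txt"

# comm needs both inputs sorted; -12 prints only the lines present in both.
ambiguous=$(comm -12 "$tmp/dates.all.duplicates.txt" "$tmp/dates.untitled.txt")
echo "$ambiguous"
```

Every line that comes out is an untitled note whose date alone is not enough to identify it; an empty result would mean all untitled notes can be matched on date.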

date + evernote_toc metadata

cp ~/dev/tjen/evernote-toc/out/Blog.json ./
cat Blog.json| jq '.notes[] | {title, created: (.created / 1000 ) | strftime("%Y%m%dT%H%M%SZ") }'

{ "title": "Google IO: Team Geek", "created": "20141225T190200Z" }
{ "title": "On Martial Arts and knife fighting", "created": "20160608T195129Z" }
{ "title": "Speed reading", "created": "20161005T133346Z" }
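
The conversion works because the evernote-toc JSON stores created as milliseconds since the epoch, while jq's strftime expects seconds. A minimal sketch, assuming jq is installed; the one-note sample document is hypothetical and its millisecond value is back-calculated from the "Speed reading" timestamp shown above.

```shell
# Hypothetical miniature of Blog.json; "created" is in epoch milliseconds.
sample='{"notes":[{"title":"Speed reading","created":1475674426000}]}'

# Divide by 1000 to get seconds, then format in the same style the .enex uses.
stamp=$(printf '%s' "$sample" \
  | jq -r '.notes[] | (.created / 1000) | strftime("%Y%m%dT%H%M%SZ")')
echo "$stamp"
```

strftime formats in UTC, which is why the trailing Z can be hard-coded in the format string.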

filter untitled note titles

(partial) get timestamps of missing notes

cat dates.untitled.txt | sed 's/\(.*\)/"\1",/' | tr -d '\n' | sed -e 's/^/[/' -e 's/,$/]/'

["20130814T193511Z",…,"20220831T201458Z"]

(partial) filter note + title by timestamp

cat Blog.json| jq -r '.notes[] | {title, created: (.created / 1000 ) | strftime("%Y%m%dT%H%M%SZ") } | select(.created as $created | ["20130814T193511Z"] | index($created)  )'
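
The JSON array used in the filter above comes out of the sed/tr pipeline from the previous step. A small sketch of what each stage does, run on two made-up lines in a temp file (the path is only for illustration):

```shell
tmp=$(mktemp -d)
printf '%s\n' 20130814T193511Z 20220831T201458Z > "$tmp/dates.untitled.txt"

# 1) wrap every line in double quotes and append a comma
# 2) join all lines into one
# 3) prepend "[" and turn the final comma into "]"
array=$(sed 's/\(.*\)/"\1",/' "$tmp/dates.untitled.txt" \
  | tr -d '\n' \
  | sed -e 's/^/[/' -e 's/,$/]/')
echo "$array"
```

With an empty input file the pipeline prints nothing at all rather than [], so the jq filter it gets spliced into would be invalid in that case.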

(full) filter untitled note titles, sorted by created time

cat Blog.json| jq -r '.notes[] | {title, created: (.created / 1000 ) | strftime("%Y%m%dT%H%M%SZ") } | select(.created as $created |  '"$(cat dates.untitled.txt | sed 's/\(.*\)/"\1",/' | tr -d '\n' | sed -e 's/^/[/' -e 's/,$/]/')"' | index($created)  )' | jq -s 'sort_by(.created)' |tee missing-titles-with-dates.json

which titles are missing?

cat ~/Downloads/yarle-run/output2/notes/Blog/Table_of_Contents.md | head

delete frontmatter

-e '/---/,/---/d'

delete stuff from line, keep only title

-e 's/^[0-9][0-9]*\. \[\[.*.md|\(.*\)\]\]$/\1/'

get titles from table_of_contents.md

cat ~/Downloads/yarle-run/output2/notes/Blog/Table_of_Contents.md | sed -e '/---/,/---/d' -e 's/^[0-9][0-9]*\. \[\[.*.md|\(.*\)\]\]$/\1/' | sort > toc-titles.txt
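
To see what the two sed expressions do, here is a sketch on a made-up Table_of_Contents.md. The front matter and link layout are assumptions based on the patterns above, and the patterns are slightly tightened (anchored ---, escaped dot before md) compared to the ones I actually ran.

```shell
tmp=$(mktemp -d)

# Hypothetical TOC file: front matter followed by numbered wiki links.
cat > "$tmp/Table_of_Contents.md" <<'EOF'
---
title: Table of Contents
---
1. [[Speed reading.md|Speed reading]]
2. [[Google IO_ Team Geek.md|Google IO_ Team Geek]]
EOF

# First expression drops the front matter block,
# second keeps only the link text after the "|".
titles=$(sed -e '/^---$/,/^---$/d' \
             -e 's/^[0-9][0-9]*\. \[\[.*\.md|\(.*\)\]\]$/\1/' \
             "$tmp/Table_of_Contents.md" | sort)
echo "$titles"
```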

get titles from enex.json file (translated some characters to match toc)

jq -r '."en-export".note[].title' all-without-resources.json  | tr ':?"/|' '_' |tr -d '#[]' | sed -e 's/\.*\.$/_/' -e 's/  / /g' | sort > enex-titles.txt
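
The character translation is easier to check in isolation. A sketch with invented titles; normalize_title is a name I made up for this example, and the replacement set is spelled out as five underscores so it does not rely on tr padding a shorter second set.

```shell
# Same steps as the pipeline above, wrapped in a function for testing.
normalize_title() {
  tr ':?"/|' '_____' \
    | tr -d '#[]' \
    | sed -e 's/\.*\.$/_/' -e 's/  / /g'
}

normalized=$(printf '%s\n' \
  'Google IO: Team Geek' \
  'Can we use an Andon Cord in programming?' \
  'To be continued...' \
  | normalize_title)
echo "$normalized"
```

Colons and question marks become underscores, and a run of trailing dots collapses into a single underscore, which is what the file names in the TOC look like.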

find diffs

diff --width=$COLUMNS --suppress-common-lines --side-by-side --color=always enex-titles.txt toc-titles.txt
diff enex-titles.txt toc-titles.txt | grep "^<" | wc -l
diff enex-titles.txt toc-titles.txt | grep "^>" | wc -l

29 # because note Table_of_Contents was hacked into the enex, but does not contain itself
28

these notes are screwed up

diff enex-titles.txt toc-titles.txt | grep "^<" | sed 's/^< //' | sort | uniq -c

   1 Table of Contents
  28 Untitled Note

These are the titles that should have been there

diff enex-titles.txt toc-titles.txt | grep "^>" | sed 's/^> //'

Article_ We are creators - André Chaperon, Shawn Twing
Blog topic categories
Book_ The Art of Empathy - Karla McLaren
Book_ The Start-Up J Curve - Howard Love
Book_ This is LEAN - Niklas Modig & Par Ahlstrom
Can we use an Andon Cord in programming_
Changing culture by selecting Hero Stories
Coding Pricinples (zooming patterns)
Course_ Cloud Native Entrepreneur - Patrick Lee Scott
IMG_20220729_193455.jpeg
Metaphor_ Catching the big fish vs learning to fish
Model_ 3 Axes of software development
Model_ 3X - software stages
Model_ Action Requiring Neurological Program
Model_ Grouping tests
Model_ Hypothesis-driven delivery
Model_ incremental vs iterative
Model_ naming tests
Model_ primary vs secondary needs
Model_ tests must clearly express required functionality
Model_ workgroup vs team
Opinion_ Where there is blame, there is no learning
Opinion_ todo - doing - done is too limited
TDD styles books
The 7 Habits of Highly Effective People
Video_ Code as Risk • Kevlin Henney
Video_ The Infinite Game_ How to Lead in the 21st Century - Simon Sinek
Video_ The Secret Assumption of Agile - Fred George

replace inline with xq

WIP fragment

cat blog-without-resources-or-content.xml | xq '{"20240704T231604+01:00":"foo","20220710T120501Z":"bar"} as $created_title |."en-export".note[]|=(. + {title: (if $created_title[.created] then $created_title[.created] else .created end ) } )' | head -n 20

WIP step prep created_title.json

cat Blog.json| jq -r '.notes[] | {title, created: (.created / 1000 ) | strftime("%Y%m%dT%H%M%SZ") } | select(.created as $created |  '"$(cat dates.untitled.txt | sed 's/\(.*\)/"\1",/' | tr -d '\n' | sed -e 's/^/[/' -e 's/,$/]/')"' | index($created)  )' | jq -sr 'sort_by(.created) | map({(.created|tostring):.title}) | add' |sed -e 's/^ *//' -e 's/: /:/' -e 's/ *$//' | tr -d '\n' > created_title.json

WIP done

cat blog-without-resources-or-content.xml | xq ''"$(cat created_title.json)"' as $created_title |."en-export".note[]|=(. + {title: (if (.title == "Untitled Note") then $created_title[.created] else .title end ) } )' | less

what is nice about this method (check if the note is untitled, and only then do the lookup)

  • does not matter if other notes have the same date
  • as long as the notes with “untitled note” have unique dates it’s fine!
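
A small sketch of that property, assuming jq is installed. The sample notes and lookup table are invented, and two things differ from what I ran: the lookup is passed with --argjson instead of being spliced into the filter text, and a // .title fallback keeps the old title when a timestamp is missing from the table.

```shell
# Invented sample: two notes share a timestamp, only one of them is untitled,
# and a third untitled note has no entry in the lookup table.
notes='{"note":[
  {"title":"Untitled Note","created":"20220710T120501Z"},
  {"title":"Already named","created":"20220710T120501Z"},
  {"title":"Untitled Note","created":"19990101T000000Z"}
]}'
lookup='{"20220710T120501Z":"Conference: SocratesBE 2022"}'

titles=$(printf '%s' "$notes" | jq -r --argjson created_title "$lookup" '
  .note[] |= (. + if .title == "Untitled Note"
                  then {title: ($created_title[.created] // .title)}
                  else {} end)
  | .note[].title')
echo "$titles"
```

The note that already has a title is left alone even though it shares the timestamp, and the untitled note without a lookup entry stays "Untitled Note" instead of getting a null title.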

WIP speedup

cat blog-without-resources-or-content.xml | xq "$(cat created_title.json)"' as $created_title |."en-export".note[]|=(. + if .title == "Untitled Note" then {title:$created_title[.created]} else {} end )' | less

full solution keeping xml

xq --xml-output "$(cat created_title.json)"' as $created_title |."en-export".note[]|=(. + if .title == "Untitled Note" then {title:$created_title[.created]} else {} end )' ~/Downloads/yarle-run/input/evernote.toc/all/Blog.enex > Blog.fixed.enex

speedrun

prep dates from “Untitled Note”

xq -r '."en-export".note[] | {title,created} |select(.title == "Untitled Note") | .created' ~/Downloads/yarle-run/input/evernote.toc/all/Blog.enex | sort > dates.untitled.txt

verify that there are no duplicates

cat dates.untitled.txt | sort | uniq -c
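
Reading the uniq -c counts by eye works for a short list; a sketch of a stricter check using uniq -d, which prints only the lines that occur more than once (sample timestamps are made up):

```shell
tmp=$(mktemp -d)
printf '%s\n' 20220821T104335Z 20210505T101010Z 20220821T104335Z > "$tmp/dates.txt"

# uniq -d prints one copy of each line that appears more than once
# in the sorted input, so empty output means every timestamp is unique.
dupes=$(sort "$tmp/dates.txt" | uniq -d)

if [ -z "$dupes" ]; then
  echo "all timestamps are unique"
else
  echo "duplicate timestamps:"
  echo "$dupes"
fi
```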

prep created_title.json

cat Blog.json| jq -r '.notes[] | {title, created: (.created / 1000 ) | strftime("%Y%m%dT%H%M%SZ") } | select(.created as $created |  '"$(cat dates.untitled.txt | sed 's/\(.*\)/"\1",/' | tr -d '\n' | sed -e 's/^/[/' -e 's/,$/]/')"' | index($created)  )' | jq -sr 'sort_by(.created) | map({(.created|tostring):.title}) | add' |sed -e 's/^ *//' -e 's/: /:/' -e 's/ *$//' | tr -d '\n' > created_title.json
xq --xml-output "$(cat created_title.json)"' as $created_title |."en-export".note[]|=(. + if .title == "Untitled Note" then {title:$created_title[.created]} else {} end )' ~/Downloads/yarle-run/input/evernote.toc/all/Blog.enex > Blog.fixed.enex