Show HN: Latex-wc – Word count and word frequency for LaTeX projects

10 points · sethbarrettAU · 2 days ago

I was revising my proposal defense and kept feeling like I was repeating the same term. In a typical LaTeX project split across many .tex files, it’s awkward to get a quick, clean word-frequency view without gluing everything together or counting LaTeX commands/math as “words”.

So I built latex-wc, a small Python CLI that:

- extracts tokens from LaTeX while ignoring common LaTeX “noise” (commands, comments, math, refs/cites, etc.)

- can take a single .tex file or a directory and recursively scan all *.tex files

- prints a combined report once (total words, unique words, top-N frequencies)

Fastest way to try it is `uvx latex-wc [path]` (file or directory). Feedback welcome, especially on edge cases where you think the heuristic filters are too aggressive or not aggressive enough.
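
For anyone curious what the three steps above involve, here is a rough sketch in plain Python. It is not latex-wc's actual code: the regex patterns, the list of stripped commands, and the names `extract_words`, `count_words`, and `report` are all assumptions made for illustration, and the word pattern is ASCII-only.

```python
import re
from collections import Counter
from pathlib import Path

# Heuristic filters, applied in order. These patterns are illustrative
# assumptions, not the rules latex-wc actually uses.
_NOISE = [
    # comments: an unescaped % up to the end of the line
    re.compile(r"(?<!\\)%.*"),
    # display-math environments (only equation/align in this sketch)
    re.compile(r"\\begin\{(equation|align)\*?\}.*?\\end\{\1\*?\}", re.S),
    # $$...$$ display math, then $...$ inline math
    re.compile(r"\$\$.*?\$\$", re.S),
    re.compile(r"(?<!\\)\$.*?(?<!\\)\$", re.S),
    # refs/cites/labels/includes, dropped together with their arguments
    re.compile(
        r"\\(?:ref|eqref|cite\w*|label|input|includegraphics|include)"
        r"\*?(?:\[[^\]]*\])*\{[^}]*\}"
    ),
    # any remaining command name; its braced argument text is kept
    re.compile(r"\\[a-zA-Z]+\*?"),
]

# ASCII words, allowing internal apostrophes and hyphens.
_WORD = re.compile(r"[A-Za-z]+(?:['’-][A-Za-z]+)*")


def extract_words(tex: str) -> list[str]:
    """Strip LaTeX noise from a string and return lowercase word tokens."""
    for pattern in _NOISE:
        tex = pattern.sub(" ", tex)
    return [w.lower() for w in _WORD.findall(tex)]


def count_words(path: str) -> Counter:
    """Count words in one .tex file, or in every *.tex file under a directory."""
    root = Path(path)
    files = [root] if root.is_file() else sorted(root.rglob("*.tex"))
    counts = Counter()
    for f in files:
        counts.update(extract_words(f.read_text(encoding="utf-8", errors="ignore")))
    return counts


def report(counts: Counter, top_n: int = 10) -> str:
    """Format one combined report: totals first, then the top-N frequencies."""
    lines = [
        f"total words: {sum(counts.values())}",
        f"unique words: {len(counts)}",
    ]
    lines += [f"{n:6d}  {w}" for w, n in counts.most_common(top_n)]
    return "\n".join(lines)
```

Calling `print(report(count_words("thesis/")))` (with `thesis/` standing in for your own project directory) prints a single combined report. The order of the filters matters in this sketch: comments and math are removed before the generic command pattern runs, so command names inside them never get a chance to leak their arguments into the count.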

gucci-on-fleek · 5 hours ago
Are you aware of the "texcount" program [0] that's distributed with TeX Live by default?
[0]: https://ctan.org/pkg/texcount?lang=en

mci · 11 hours ago

  detex "$@" | wc
  detex "$@" | tr -cs '[:alnum:]' '\n' | grep . | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn

dang · 2 days ago
We need a link!

news.ycombinator.com/item?id=46865138