SmackerNews

πFS

881 points · 196 comments · 20 hours ago · helterskelter

github.com

jamwise18 hours ago
Reminds me of when I tried to use the Library of Babel as a data compression tool. It led me down a fun rabbit hole and was my first introduction to information theory.
The conclusion being that you basically need the same amount of data to represent the address of your data as the data itself, so it's not really effective at compression, just a fun thought experiment.
The cool part of this in modern times is that LLMs are basically a form of lossy compression that actually achieves the gist of what these tools fail at. Although it is lossy, and requires a massive substrate. This is related to the idea of AI/LLMs being a form of language compression.
dang18 hours ago
Related. Others?
πfs – A data-free filesystem - https://smackernews.com/item/36357466 HN - June 2023 (107 comments)
πfs – A data-free filesystem - https://smackernews.com/item/28699499 HN - Sept 2021 (30 comments)
PiFS – The Data-Free Filesystem - https://smackernews.com/item/26208704 HN - Feb 2021 (1 comment)
Πfs: Never worry about data again - https://smackernews.com/item/21359338 HN - Oct 2019 (1 comment)
The π Filesystem for FUSE: Store Your Data in π - https://smackernews.com/item/19223032 HN - Feb 2019 (1 comment)
pifs - Avoid disk space usage by saving your files in the digits of Pi - https://smackernews.com/item/18687275 HN - Dec 2018 (1 comment)
πfs – A data-free filesystem - https://smackernews.com/item/13869691 HN - March 2017 (105 comments)
Πfs: Stores your data in π - https://smackernews.com/item/10856108 HN - Jan 2016 (1 comment)
Πfs: Never worry about data again - https://smackernews.com/item/10847693 HN - Jan 2016 (1 comment)
File system that stores location of file in Pi - https://smackernews.com/item/8018818 HN - July 2014 (98 comments)
100% Compression Using Pi - https://smackernews.com/item/6698852 HN - Nov 2013 (32 comments)
(Reposts are fine after a year or so; links to past threads are just to satisfy extra-curious readers)
emptyroads17 hours ago
Reminds me of nsafs, the National Security Agency Filesystem ("free" because the government pays for it) - https://github.com/freedomtools/nsafs
adzm18 hours ago
It is worth noting that as the length of data increases it becomes extremely unlikely that the index and length of the sequence within pi would actually be smaller than the data.
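A rough way to check this numerically, as a sketch rather than anything taken from the project: compute some decimal digits of π, look up every k-digit string, and compare k with the number of digits needed to write down the position of its first occurrence. The helper names (`arctan_inv`, `pi_decimals`, `mean_index_length`), the use of Machin's formula, and the 20000-digit search window are all assumptions made for this illustration.

```python
from itertools import product


def arctan_inv(x: int, one: int) -> int:
    """Return arctan(1/x) scaled by `one`, using integer arithmetic only."""
    term = one // x
    total = term
    x2 = x * x
    n, sign = 3, -1
    while term:
        term //= x2
        total += sign * (term // n)
        n, sign = n + 2, -sign
    return total


def pi_decimals(n: int) -> str:
    """First n decimal digits of pi after the '3.', via Machin's formula."""
    guard = 10  # extra digits that absorb truncation error
    one = 10 ** (n + guard)
    pi_scaled = 16 * arctan_inv(5, one) - 4 * arctan_inv(239, one)
    return str(pi_scaled)[1 : n + 1]


def mean_index_length(digits: str, k: int):
    """Mean number of digits in the 1-based position of the first occurrence
    of each k-digit string, plus how many strings were not found at all."""
    lengths, missing = [], 0
    for combo in product("0123456789", repeat=k):
        pos = digits.find("".join(combo))
        if pos < 0:
            missing += 1
        else:
            lengths.append(len(str(pos + 1)))
    return sum(lengths) / len(lengths), missing


digits = pi_decimals(20000)  # arbitrary window, large enough for k <= 3
for k in (1, 2, 3):
    mean_len, missing = mean_index_length(digits, k)
    print(f"{k}-digit strings: mean index length {mean_len:.2f}, {missing} not found")
```

On average the index already takes more digits than the string it points at, and the gap only grows with k, since a k-digit string is expected to first show up around position 10^k.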
MisterTea18 hours ago
Reminds me of: https://www.spronck.net/sloot.html
Further reading: https://en.wikipedia.org/wiki/Sloot_Digital_Coding_System
mkesper7 hours ago
Outdated! Should have linked directly to https://github.com/philipl/inferencefs/ obviously.
windward16 hours ago
One of the properties that π is conjectured to have is that it is normal
conjectured
Glad to see one of my pet points of pedantry come up. No non-constructed irrational number has ever been proven to be normal or disjunctive.
utopiah4 hours ago
"This file doesn't look like what I remember.
Are you sure? It's been a while since you last opened it. Memory is funny like that. The file is fine — maybe take another look with fresh eyes."
from https://github.com/philipl/inferencefs/
Maybe I do not indeed remember properly. Anyway, back to watching "Eternal Sunshine of the Spotless Mind" for the first time, I think.
bobim18 hours ago
It is disturbing to realize that pi then contains all past and future knowledge, including when I'll pass away.
vbarrielle7 hours ago
This would be easier using the Champernowne constant (https://en.wikipedia.org/wiki/Champernowne_constant) which is guaranteed to be normal, not just conjectured.
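A minimal sketch of why the lookup is easy there, assuming the base-10 constant 0.123456789101112...: every positive integer appears as its own numeral at an offset that has a closed form, so no searching is needed. The function names and the `\x01` sentinel byte (used to preserve leading zero bytes) are assumptions of this example, and the offset returned is the canonical occurrence, not necessarily the first one.

```python
def block_start(d: int) -> int:
    """0-based offset of the first d-digit number in 123456789101112..."""
    return sum(9 * 10 ** (k - 1) * k for k in range(1, d))


def champernowne_offset(n: int) -> int:
    """0-based offset at which the numeral of n starts in the digit stream."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    d = len(str(n))
    return block_start(d) + (n - 10 ** (d - 1)) * d


def encode(data: bytes) -> tuple[int, int]:
    """Map bytes to (offset, length) in the digit stream."""
    n = int.from_bytes(b"\x01" + data, "big")  # sentinel keeps leading zeros
    return champernowne_offset(n), len(str(n))


def decode(offset: int, length: int) -> bytes:
    """Invert encode() arithmetically, without generating any digits."""
    n = 10 ** (length - 1) + (offset - block_start(length)) // length
    return n.to_bytes((n.bit_length() + 7) // 8, "big")[1:]


offset, length = encode(b"pi")
print(offset, length, decode(offset, length))
```

The catch is the same as with π: the offset is roughly n times its digit count, so it is always longer than the data it stands for.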
aidenn018 hours ago
I vaguely remember an entry to a compression benchmark that gamed the benchmark by treating the filename as part of the input to the decompression algorithm, thus beating the metric, which only measured the size of the file.
layer815 hours ago
In this implementation, to maximise performance, we consider each individual byte of the file separately, and look it up in π.
Considering each individual bit separately would be even more performant: you only need the indexes 2 and 33, and there is an efficient mapping of those to the bits in storage.
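For the curious, a tongue-in-cheek sketch of the bitwise variant, assuming 1-based positions in the decimal digits of π with the leading 3 counted as position 1. The digit string is hardcoded and the names `PI_DIGITS`, `BIT_TO_INDEX`, `encode` and `decode` are made up for this example.

```python
# First 41 decimal digits of pi, with the leading "3" at position 1.
PI_DIGITS = "31415926535897932384626433832795028841971"

# 1-based position of the first "1" and the first "0" in that string.
BIT_TO_INDEX = {bit: PI_DIGITS.index(bit) + 1 for bit in "10"}


def encode(data: bytes) -> list[int]:
    """Replace every bit of the input with its index in pi."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return [BIT_TO_INDEX[bit] for bit in bits]


def decode(indexes: list[int]) -> bytes:
    """Look each index up in pi again and reassemble the bytes."""
    bits = "".join(PI_DIGITS[i - 1] for i in indexes)
    return bytes(int(bits[i : i + 8], 2) for i in range(0, len(bits), 8))


print(BIT_TO_INDEX)  # {'1': 2, '0': 33}
print(encode(b"A"))
```

Every stored "index" still carries exactly one bit of information, which is the whole joke: the lookup in π adds nothing.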
nyc_pizzadev16 hours ago
Just a heads up, this is writing 16 bits for every 8 bits of input:
https://github.com/philipl/pifs/blob/fded8bf7b8f4fc64233e37b...
partsch18 hours ago
Finally, someone is doing something about the rising prices of storage!
hnbad25 minutes ago
So this is how I find out that in Verdana lowercase pi looks exactly like lowercase Cyrillic п (pe), i.e. like an open rectangle rather than a bit curvy.
hnlmorg18 hours ago
This is probably a dumb question, but do we actually know that pi has an infinite number of decimal digits or are we assuming that it does because we haven’t developed a sufficiently powerful computer to calculate the last digit of pi?
I’m guessing this is something that could be formally proven?
baalimago8 hours ago
This got me thinking about the "simulation theory":
If our universe is simulated, it must be possible to snapshot the entire state for one iteration (however time now is quantized, open question). "... From here, it is a small leap to see that if π contains all possible files, why are we wasting exabytes of space storing those files, when we could just look them up in π!" (from pifs, above)
This means that not only does a singular snapshot of our universe exist in pi, but every single one does.
The information for our entire universe's simulation is stored in pi (and every other number like it).
Lalabadie19 hours ago
Love it! This feels very much in the spirit of Tom7's Harder Drive [1]
[1] https://www.youtube.com/watch?v=JcJSW7Rprio
thangalin18 hours ago
https://cs.stackexchange.com/a/53737/1704
Matches that occur early enough in π to attain significant compression will not be varied. That is, it isn't possible to use π to compress interesting, real-world data because real-world strings are unlikely to arise early.
giancarlostoro18 hours ago
I... I can't tell if this is an elaborate troll or pure genius. I love it.
torh6 hours ago
No thanks, I have all the files I need right here in /dev/urandom.
notatyrannosaur7 hours ago
Reminds me of https://en.wikipedia.org/wiki/MS_Fnd_in_a_Lbry
Meta: every single comment seems to start with some variation of "Reminds me of". Had to get mine in.
golem147 hours ago
This isn't really going far enough; the readme says - keep the metadata on a piece of paper or whatever. But: The metadata is data too, you can find it ALSO within \pi. So it's \pi all the way down.
Not even sure if there is an interesting Collatz-like conjecture here.
koolala18 hours ago
Short Storage Number - SSN
0x123456789ABCDEF0
use this number as a shorter nibble storage alternative...
sam_goody50 minutes ago
Theoretically, if we had a GPU so fast it could instantly calculate billions of digits of Pi, and a small hard drive, could this actually be made to work?
Cache all the last lookups but otherwise just store the index within pi? And for larger files - split them into chunks of whatever size could be handled?
(I mean, I realize this is a joke and can't make sense - but GPUs can be really really fast, and am willing to make a fool of myself by asking.)
And if we had a quantum computer that stores all of pi on one qubit, that could make things even faster ;/
outadoc7 hours ago
Reminded me of PortalRunner's latest video: https://www.youtube.com/watch?v=w6rkhvdAqHU
tptacek19 hours ago
https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
adzm18 hours ago
I'm intrigued that π was capitalized to Π presumably automatically in the HN headline.
z3t416 hours ago
Someone should make a service "where in the pi am I" then you could use it as a short link. Then there will be hardware accelerated pi chips. All computers will come with pi preinstalled.
yason5 hours ago
Where do you store the indices? Blockchain!
amelius7 hours ago
I am curious what this means for copyright. I.e. if all music/songs were already encoded in Pi even before the universe started existing.
amelius7 hours ago
Instead of using Pi wouldn't it be better to choose a number for which the conjecture is true?
And for which the index is easy to compute?
keithnz15 hours ago
isn't this relying on properties that aren't proven about pi? it needs to be disjunctive or normal, and neither of those are proven
chris_sn16 hours ago
Funnily enough I’m reading Service Model and just got to the bit in the Library Archive, which has a very similar vibe to this project. Love it
charles_f18 hours ago
Posted many times before: https://news.ycombinator.com/from?site=github.com/philipl
My favourite issue being about GDPR compliance https://github.com/philipl/pifs/issues/56
mohsen17 hours ago
If you think about it, a piano has all the possible songs in it too!
markcollins057 hours ago
Technically π being normal is still unproven. So if the conjecture is false this whole thing falls apart. But that's what makes it a perfect nerd joke.
psadri10 hours ago
This is part of the plot in Murakami's Hard-Boiled Wonderland and the End of the World.
actusual17 hours ago
This is why I got pi tattooed. It's a tattoo of all tattoos.
yassi_dev16 hours ago
I built something with a similar spirit for Pi day: https://pi.yassi.dev/
glitchc18 hours ago
At what point is the metadata larger than the actual file?
bilsbie16 hours ago
I’d guess even the index in pi for my phone number would be more digits than the phone number.
So not really a compression scheme.
amluto17 hours ago
Why is this thing so slow? It took me five minutes to store a 400 line text file!
Well, this is just an initial prototype, and don't worry, there's always Moore's law!
Seriously? They're only storing individual bytes in pi:
In this implementation, to maximise performance, we consider each individual byte of the file separately, and look it up in π.
So the whole transformation should be trivially reducible to a 256-element lookup table from source byte to location in pi and a similar table used to convert back the other way. Maybe a fancy formula could be used for the (never actually encountered) case in which a byte is encoded by one of the infinite available noncanonical encodings.
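Roughly what that table could look like, as a sketch under assumptions rather than the project's actual code: it assumes bytes are matched against pairs of hexadecimal digits of π's fractional part, computes those digits with Machin's formula instead of whatever the repository uses, and searches an arbitrary window of 8192 digits. All function names are hypothetical.

```python
def arctan_inv(x: int, one: int) -> int:
    """Return arctan(1/x) scaled by `one`, using integer arithmetic only."""
    term = one // x
    total = term
    x2 = x * x
    n, sign = 3, -1
    while term:
        term //= x2
        total += sign * (term // n)
        n, sign = n + 2, -sign
    return total


def pi_hex_fraction(n: int) -> str:
    """First n hexadecimal digits of the fractional part of pi."""
    guard = 8  # extra hex digits that absorb truncation error
    one = 1 << (4 * (n + guard))
    pi_scaled = 16 * arctan_inv(5, one) - 4 * arctan_inv(239, one)
    frac = (pi_scaled - 3 * one) >> (4 * guard)
    return format(frac, "x").zfill(n)


def build_table(hex_digits: str) -> dict[int, int]:
    """Map each byte value to the first offset where its two hex digits occur."""
    table: dict[int, int] = {}
    for offset in range(len(hex_digits) - 1):
        table.setdefault(int(hex_digits[offset : offset + 2], 16), offset)
    return table


def encode(data: bytes, table: dict[int, int]) -> list[int]:
    return [table[byte] for byte in data]


def decode(offsets: list[int], hex_digits: str) -> bytes:
    # Reading the digits back also handles non-canonical offsets.
    return bytes(int(hex_digits[o : o + 2], 16) for o in offsets)


HEX_DIGITS = pi_hex_fraction(8192)  # arbitrary window size for this sketch
TABLE = build_table(HEX_DIGITS)
print(len(TABLE), encode(b"pi", TABLE))
```

Once the table exists, π is never consulted again for encoding, and each stored offset costs at least as much space as the byte it replaces.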
woah16 hours ago
I've simplified it and made it more flexible
3._1_415926535897932384626433832795_0_288419716939
0x1ceb00da6 hours ago
The design is very human
liamYC12 hours ago
Developed a UI with Claude here:
https://ljsimpkin.github.io/pi-compress
It really shows how inefficient such a compression would be. Haha nice idea
ctan413 hours ago
μῆνιν ἄειδε:
Sing, the wrath. Rendering in LaTeX.
[1]: https://smackernews.com/item/48010729 HN
keyle12 hours ago
Note: this is from 2012.
[deleted]
adamwright3267 hours ago
The metadata storage problem is the real punchline here. You end up needing more space for the metadata than the original data, so it's a zero-sum joke.
dofcof8 hours ago
This is a classic
anon29117 hours ago
It is actually not proven that the decimal expansion (or any rational base expansion) of pi contains all possible sequences of digits. Intuitively it sounds like it would, since the expansion is infinite, but that does not follow. For example, the number 0.101001... (i.e., the decimal formed by concatenating N zeros followed by a 1, for every N from 0 to infinity) has an infinite, never-repeating expansion and is irrational, but does not contain every sequence of digits.
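A small sketch of that counterexample, with a hypothetical helper `counterexample_digits` that builds a finite prefix of the expansion. A finite prefix proves nothing on its own, but the construction makes the gap visible: every 1 is followed by a 0, so "11" can never occur, and no digit other than 0 or 1 is ever used.

```python
def counterexample_digits(blocks: int) -> str:
    """Digits after the decimal point: N zeros then a 1, for N = 0 .. blocks - 1."""
    return "".join("0" * n + "1" for n in range(blocks))


prefix = counterexample_digits(200)
print(prefix[:21])      # 101001000100001000001
print("11" in prefix)   # False
print(set(prefix))      # only the digits 0 and 1 ever appear
```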
[deleted]
Levitating19 hours ago
absolutely genius
stogot13 hours ago
Have there been attempts to prove the conjecture?
j3th9n17 hours ago
Why would anyone need πfs, since you can already build such a system yourself quite trivially on Linux?
leephillips18 hours ago
What a brilliant idea! Of course, of course, it’s not in the repository so I can’t apt-get install it. Debian...always so far behind.
dwheeler15 hours ago
Horrible. Brilliant. Love it.
mzelling17 hours ago
Looked at the repo but it says NOTHING about what value this project offers.
I mean, I get that it's "fun" to store information within the digits of pi. But is this just amusement, or is there a value prop for production use here?
(Speaking as a math major, by the way. I'm sympathetic to the cause.)
spchampion217 hours ago
This is interesting, but I feel like my use cases would better align with a different irrational number. Could I get an option to do this with e instead? /s

news.ycombinator.com/item?id=48480978