UpEnd dev log #0.1 - Frustration and Files
Hello, World! This is what I've been working on. It's about files, and a bit more.
I’ve been putting off writing the first entry in the dev log for quite a while, mostly because it’s hard to work out where to start. The project touches on a lot of individual itches I’ve had over a long time, most of which have no obvious theme connecting them - or rather, finding that theme is the project. The itches themselves are small, compact, concrete grievances, whereas what I propose is somewhat nebulous and aims at nothing less than reworking the fundamental way data is accessed through a computer.
I’d like to think I’m on a certain path though, that I’m Inventing on Principle, but it’s articulating the principles that is the hard part, and perhaps what I’m trying to do is to find these principles along the way, one capability at a time.
My hope is that by building a system that allows for a different kind of relating (to) information, and leaving it open-ended in just the right way, the combinatorics of the core mechanics will kick in and “take care of the rest”. Omar[1] put it nicely in his tweet, but it wouldn’t be a proper attempt at a computer revolution without a quote from one of the greats:
You get simplicity by finding a slightly more sophisticated building block to build your theories out of.
Alan Kay: Power of Simplicity
The project itself may be complex and certainly difficult to implement, as computer foundations don’t budge easily - but (a new kind of) simplicity is the goal.
In this series of posts, I’d like to outline the problems and motivations behind UpEnd and, at the end, the solution I have come up with, which I believe could address them.
So what is it all about?
Part 1 - Files
It’s about files and the filesystem, data and databases, and personal computing. The venerable file in a tree is one of the most successful building blocks of personal computing we have. Beware those who seek to take away your files! They are one of the few ways we have left to actually take control over data and inspect the “insides” of a computer, and it’s no coincidence that the platforms with the least user control all try to hide files in one way or another. Conversely, the oft-repeated (and almost sacred) “everything is a file” philosophy of UNIX means that an operator can do just about anything on the computer with a handful of simple tools. Recently, Omar’s TabFS project took the browser and essentially turned it inside out by exposing everything, from input fields to image files, through the filesystem, applying that philosophy to arguably the most powerful and ubiquitous platform currently in existence. You can now automate everything on the web in just a couple of lines of shell, or explore it in Explorer or what have you - the choice is yours, it’s all just files!
However… The “file” on its own isn’t a great abstraction. To be honest, it’s not really even much of an abstraction, as is noted in “There is only one OS, and it’s been obsolete for decades”:
Let’s look at this fundamental pillar of the Unix philosophy. What, exactly, is a file?
“A stream of bytes”? Really? What sort of a computing abstraction is “everything is a stream of bytes”??
First of all, let’s just realise that handling bytes is an optimisation over handling bits, as is handling words of any size. So the actual content of this is reduced to: “everything is a stream of bits”.
And last time I checked, everything in computing boils down to bits in the end. So half of it clearly contributes nothing more than what we already know.
Let’s admit it, the “file” is probably the dumbest abstraction conceivable. By which I don’t mean to cast any judgment! Dumb metaphors are often simple, and simple is very often good; but as the Kay quote outlines, you can actually get more simplicity by (even slightly!) upgrading your bedrock. And since the file is quite literally just “a bunch of bytes with a name”, I think we have a bit more room to go concerning sophistication.
Really, in “everything is a file”, it’s the “everything is a” part that does most of the work, rather than the “file” part.
Having consistency and generalizability in the way you interact with a computer is huge. It means that you don’t need to prepare for every possibility in advance, and that you can compose different actions and modalities.[2] Once you have a “file”, it can be copied, e-mailed, zipped… Imagine if every single one of those had to be coded separately (kind of like it used to be, and still is, on smartphones, cough).
However… what do “copying, e-mailing, zipping” all have in common? That’s right, it’s all really just copying in different guises. That’s about all you can do with a stream of data when you know nothing else about it, and that’s what a “file” fundamentally is.
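As a rough illustration of "copying in different guises", here is a minimal Python sketch. The function names (`copy_plain`, `copy_compressed`, `read_compressed`) are hypothetical and exist only for this example. Neither copy ever asks what the bytes mean: zipping is the same read-and-write loop with an encoder in the middle.

```python
import gzip
import shutil
from pathlib import Path


def copy_plain(src: Path, dst: Path) -> None:
    """Copy the byte stream from src to dst unchanged."""
    with src.open("rb") as fin, dst.open("wb") as fout:
        shutil.copyfileobj(fin, fout)


def copy_compressed(src: Path, dst: Path) -> None:
    """Copy the same byte stream, passing it through a gzip encoder."""
    with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


def read_compressed(src: Path) -> bytes:
    """Undo copy_compressed and return the original bytes."""
    with gzip.open(src, "rb") as fin:
        return fin.read()
```

Both functions work equally well on a photo, a song or a spreadsheet, precisely because they know nothing about any of them.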
But a file never really is meaningfully just a stream of data, except for two cases:
You’re copying it from one place to another.
In literally every other scenario, you need to interpret the file somehow for it to be useful at all. In the better case this means (potentially) reading the header and parsing the file into its constituents; in the worse case it means “parsing” it in ad-hoc, brittle ways because “it’s just text”, as UNIX does. In either case the structure is there, it’s just hidden behind the almighty abstraction of “just a bunch of bytes”. This is one of the key arguments of “Programmer's critique of missing structure of operating systems” (a piece that played a huge role in me attempting to build UpEnd at all).
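To show what "reading the header" looks like in the better case, here is a small Python sketch that pulls the image dimensions out of a PNG. The byte layout (an 8-byte signature followed by an IHDR chunk holding width and height as big-endian 32-bit integers) comes from the PNG format itself; the function name `png_dimensions` is a hypothetical one chosen for this example.

```python
import struct
from typing import Tuple

# Every PNG file starts with these eight bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> Tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of a PNG byte stream."""
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG stream")
    # First chunk: 4-byte big-endian length, then a 4-byte chunk type.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("missing or malformed IHDR chunk")
    # IHDR payload begins with width and height, both big-endian 32-bit.
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

The width and height are in every PNG, but the filesystem has no idea they exist; only code that knows this particular layout can surface them.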
The gist is: the data does have meaning, but that meaning simply cannot be expressed with the general tools that we have for dealing with it. So it remains unexpressed, or else it’s scattered across formats, tools and different “apps”, and it only appears if we look at the data in the right way, with the right tool, on the right machine. This meaning, put into machine form, can be broadly called “metadata”, and its omission from the base layer of the operating system is one of the key things UpEnd is meant to address.
This includes everything from ID3 tags to EXIF, but there’s really no reason to stop there. Those are just the forms of metadata that were, evidently, too important not to be implemented as part of the format itself (since that was the only option, really) - but everywhere you look, data has some sort of attached information: ratings, tags, notes, comments, dates, sizes, colours, flavours…
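For a sense of what "implemented as part of the format itself" means in practice, here is a minimal Python sketch that reads an ID3v1 tag, the oldest and simplest ID3 variant: a fixed 128-byte block at the very end of the file, starting with the marker `TAG`. The function name `read_id3v1` is hypothetical, the genre byte is ignored, and decoding as Latin-1 is an assumption that holds for many files but not all.

```python
from typing import Dict, Optional


def read_id3v1(data: bytes) -> Optional[Dict[str, str]]:
    """Return the ID3v1 fields stored in the last 128 bytes, or None if absent."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed width, padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
    }
```

Note that there is nowhere in this layout to put a rating, a note or a "reminds me of": the set of fields was fixed by the format, not by the person who owns the file.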
And there is even more of what isn’t really there at all! Text documents reference each other all the time, but they link to each other in the most cumbersome ways, and the real interconnections remain obscured; music does not exist in a vacuum either, but there’s no field for “reminds me of”, or “goes well together with”. The next post, then, will focus on the context of data.
[1] Whose Twitter account is honestly just an endless well of inspiration.
[2] Really, it’s the modal vs modeless interfaces debate all over again.