My friend Maxime recently blogged about “Programming Without Text Files”. I’ve always wanted to be one of those bloggy people, and being fairly lazy and uncreative I’ve decided to shamelessly purloin the topic for my inaugural post. I figured I’d share some of my own thoughts in this area of “programming without text files” and beyond.
So, what is “programming without text files” anyway? What does it mean for a programmer to work on “the underlying data structures directly”? I think the distinction is subtle, but potentially important.
All programming is ultimately done by manipulating an AST – what difference is it if we do it “directly”? After all, IDEs can give us a lot of nice features (autocompletion, autorefactoring, etc) so that we are not always laboriously typing out a textual representation of our program. Isn’t writing code which essentially “generates” an AST and then using all these fancy IDE features conceptually the same thing?
I think both conceptually and practically there are differences. Unless you’re a Lisper there are generally a few layers between you and the AST. IDEs have to work hard for these features. They scan, munge, parse, rescan, wash, rinse, and repeat…and then all this work is lost when we send our program off to the compiler/VM/linter/whatever, where the cycle repeats.
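To make the distinction a bit more concrete, here is a rough sketch using Python's standard `ast` module. The one-line program and the `ReplaceTwoWithThree` transformer are made up purely for illustration; the point is only that once the tree exists, you can edit and run it without going back through text.

```python
import ast

# The usual route: text is scanned and parsed into a tree.
tree = ast.parse("total = price * 2")

# "Direct" manipulation: edit the tree itself rather than the text.
class ReplaceTwoWithThree(ast.NodeTransformer):  # hypothetical example transformer
    def visit_Constant(self, node):
        if node.value == 2:
            return ast.copy_location(ast.Constant(value=3), node)
        return node

new_tree = ast.fix_missing_locations(ReplaceTwoWithThree().visit(tree))

# The compiler accepts the tree as-is; no text round trip is needed.
namespace = {"price": 5}
exec(compile(new_tree, "<tree>", "exec"), namespace)
print(namespace["total"])

# Text becomes just one possible rendering of the tree (Python 3.9+).
rendered = ast.unparse(new_tree)
print(rendered)
```

Even here the tree started life as text, which is sort of the point: in most languages that is the only front door.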
I also think that, on a certain level, the way many programmers conceptualize programs and the way our tools accept them are very similar: both treat a program as a series of letters, numbers, and “punctuation”. For us, the translation to an AST is almost an afterthought; it’s just something the compiler does as part of its magic. We catch glimpses of it here and there when we use fancy IDE features like code folding, syntax checking, etc. …but it remains obscured by the maya of text files. We still tend to think of programs as some English (or what-have-you) text.
Was the programming workflow really perfected decades ago? What if we change the way we think about and treat programs? Can we make a language where programming is a first-class citizen?
It reminds me of the debate that ensued when Zed Shaw chose to use SQLite databases to store the configuration for Mongrel2 instead of text files. It makes sense – a db is much easier to query and manipulate programmatically. Yet many objected that the databases were not “human readable”, or worried that their text-focused tools would have trouble dealing with a db.
I could see similar objections here – “If my program is some opaque binary blob of an AST or whatever, how am I supposed to check it into git? How am I supposed to git diff etc?”. I would answer: “Maybe you don’t need to!”
When you start thinking of your programs holistically all this stuff that was “metadata” becomes just data. It’s all a part of your program. You don’t need to worry about cluttering up your pretty little text file anymore; you have a “living, breathing” data structure – you can store revision information for nodes in the nodes themselves if you want to. You can store all sorts of information, and have your program displayed with as much or little detail as you please.
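A minimal sketch of what “revision information in the nodes themselves” might look like. Everything here (`Node`, `Revision`, `render`, the sample program) is a hypothetical toy, not a real tool or format; it just shows the same structure displayed with more or less detail.

```python
from dataclasses import dataclass, field

@dataclass
class Revision:
    author: str
    message: str

@dataclass
class Node:
    kind: str                                     # e.g. "function", "call"
    label: str
    children: list = field(default_factory=list)  # child Node objects
    history: list = field(default_factory=list)   # Revision objects for this node

def render(node, show_history=False, depth=0):
    """Return display lines for the tree, optionally with per-node history."""
    indent = "  " * depth
    lines = [f"{indent}{node.kind} {node.label}"]
    if show_history:
        for rev in node.history:
            lines.append(f"{indent}  @ {rev.author}: {rev.message}")
    for child in node.children:
        lines.extend(render(child, show_history, depth + 1))
    return lines

program = Node(
    "function", "save",
    children=[Node("call", "open", history=[Revision("alice", "use binary mode")])],
    history=[Revision("bob", "initial version")],
)

print("\n".join(render(program)))                      # terse view
print("\n".join(render(program, show_history=True)))   # same tree, more detail
```

The “metadata” never clutters the terse view because it was never mixed into a text file in the first place; it is just another field you choose to show or hide.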
Now that we’re not operating on mere lines of text, your revision control can be aware of the semantics of your language. I can get a revision history of any/all code that touches the I/O lib without a bunch of grep/awk/sh massaging? Yes please!
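Here is a rough sketch of the “which code touches the I/O lib” half of that wish, again leaning on Python's `ast` module. The sample source and the `functions_touching` helper are invented for illustration, and the match is deliberately naive (it only looks for `module.attribute` and ignores aliases and `from` imports). The revision-history half would need history attached to the nodes, which plain `ast` does not give you.

```python
import ast

SOURCE = '''
import io

def load(data):
    return io.BytesIO(data).read()

def add(a, b):
    return a + b

def dump(text):
    buf = io.StringIO()
    buf.write(text)
    return buf.getvalue()
'''

def functions_touching(tree, module_name):
    """Names of functions whose bodies reference `module_name.<something>`."""
    hits = []
    for func in ast.walk(tree):
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            if (isinstance(node, ast.Attribute)
                    and isinstance(node.value, ast.Name)
                    and node.value.id == module_name):
                hits.append(func.name)
                break
    return hits

touching_io = functions_touching(ast.parse(SOURCE), "io")
print(touching_io)
```

A grep for “io” would also match comments, strings, and variable names like `radio`; asking the tree avoids all of that.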
*.pyc files? F that! Let program and byte code live harmoniously. Let me see the AST and byte code side by side. Let programs easily hack their own binaries if they want. Cache generated code and data in the program file itself, why not?
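Python will already let you peek at both views today; they just don't live in the same file. A small sketch using the standard `ast` and `dis` modules (the one-line program is made up, and the exact bytecode listing varies between Python versions, so I'm not showing expected output):

```python
import ast
import dis
import io

source = "answer = 6 * 7"
tree = ast.parse(source)
code = compile(tree, "<in-memory>", "exec")

# The tree view (the indent argument needs Python 3.9+).
ast_view = ast.dump(tree, indent=2)

# The byte code view, captured as text instead of printed directly.
buf = io.StringIO()
dis.dis(code, file=buf)
bytecode_view = buf.getvalue()

print(ast_view)
print(bytecode_view)
```

Imagine both of these stored alongside each other in the program itself, rather than one in a text file and the other in a cache directory.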
Tired of slow coworkers who need everything spelled out? Or coders who write comments like Ayn Rand writes novels? We can have granularity in our comments…and no need to come up with some elaborate markup, and then a tool to parse it, and then integrate it with the editor. We already have a model of programming that is very explicitly about this sort of tree-walking/manipulation. Give each comment an “owner”, “level”, etc. and flip a switch for “ELI5” or “pithy” comment display. Have a conversation in the comments; heck, upvote comments or attach rage faces to bad code.
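As a toy sketch of the “flip a switch” idea: if comments are structured data rather than text, filtering them is a one-liner. The `Comment` fields and the numeric level scale below are my own assumptions for illustration, not an existing format.

```python
from dataclasses import dataclass

@dataclass
class Comment:
    owner: str
    level: int   # assumed scale: 0 = pithy, higher = more hand-holding
    text: str

def visible(comments, max_level):
    """Texts of the comments at or below the requested verbosity level."""
    return [c.text for c in comments if c.level <= max_level]

comments = [
    Comment("alice", 0, "Retry with backoff."),
    Comment("bob", 2, "We retry because the upstream service drops connections."),
    Comment("alice", 1, "Max three attempts."),
]

pithy = visible(comments, max_level=0)   # terse mode
eli5 = visible(comments, max_level=2)    # everything spelled out
print(pithy)
print(eli5)
```

Filtering by `owner`, or sorting by upvotes, would be the same kind of trivial query.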
No need to scroll through hundreds of lines of code you don’t care about.
Give me an editor that recognizes my code is fractal – none of this “jump to definition” nonsense: I want infinite zoom. Give me a real modal editor – not these piddling “insert” and “command” modes. I want trying-to-understand-code-someone-else-wrote-mode; I want bang-out-code-as-quickly-as-possible-mode; I want debugging-mode; I want writing-documentation-mode; I want security-audit-mode; I want i18n-mode.
In my opinion, this is all just the beginning of really interesting things you could do. I’ve mostly focused on the programmer/user experience, but such a change could make the lives of compiler/tooling authors easier as well.
I don’t pretend to fully understand the trade-offs involved in such a shift – to be sure there are drawbacks. I’m also not about to step up and volunteer to implement it (though I’d certainly offer to help) – it’s quite a task. However, I am suspicious of the idea that we really can’t do better than plain text files.
These ideas have been explored before (Smalltalk, Lisp, MPS, etc.); indeed, versions of most of this stuff can be found around …but perhaps today’s computing world is the time and place to revisit them and take them more seriously. We live in a world of powerful computers and improved interfaces, which could make this more practical.
To be sure, this stuff is not prohibited by our text files – it just becomes easier and more obvious when we discard them. Perhaps it is time we throw down our text shackles, and code free.