Reading

Any time I need to answer a question about REST, I end up spending 2-3 hours reading. I’m not sure if that’s a good thing or a bad thing… anyway, while doing some of that reading I was on InfoQ at this post, and that led to this post, and that led to this post, and at the end of the post the guy said:

So run away screaming from anyone who dismisses these as “just implementation details” – implementation details matter. They cost money, and break hearts.

And it’s an excellent quote. I’ve still got this page, which looks like its mime-type is wrong, and this Wikipedia page, because it looks interesting, and this website, because it’s what inspired Sam Ruby to get off his SOAPbox to take a little REST (haha I’m so clever) and this article because it seriously sounds like an insanely useful read. I might save it for another day. This is a good time to just toss this in here, which should have been included yesterday, but wasn’t. It’s about the JavaScript event model, which prior to reading that article, I had never really considered the existence of, and it’s made so much magic plain in my mind. quirksmode and InfoQ are really excellent websites, by the way.

Echoing yesterday’s words, every time I look into the internet, it grows deeper. I can’t help but wonder about whether there’s something worthy hidden in rpc-style communication, but I’ve heard the horror stories, and I know that no technology is a panacea, even if working with JSON tends to feel an order of magnitude less painful than working with XML, and all the big SOAP things I’ve worked with were XML-based.

Also, I completed day 21 of blogging, yesterday. My last two posts kind of sucked. Well, no, they really sucked. They’re just me rambling! I need to have a focus and a plan in the future, or at least a bunch of exposition and links like I’ve got here.

What’s next

My plan for the next 21 days is ambitious, and maybe a little reckless. I’m going to be speed reading. Here’s how it breaks down, with each phase lasting a full week:

Phase 1: Read at least 10 pages of at least 2 books, every day.
Phase 2: Read at least 20 pages of at least 3 books, every day.
Phase 3: Read at least 30 pages of at least 4 books, every day.

I’m going to review this plan as I go forward, because I have absolutely no idea how realistic it is. It ends with me reading 120 pages a day, which at the moment feels totally outside of my ability to fit into my life. It starts with reading 20 pages a day, which feels doable, but like it’ll take effort. If I’m going to reach a point where I’m burning through 120 pages of material a day, then I think I’ll have to be speed reading. I’ll definitely need to break 200 words per minute — let’s make some estimates:

A book has on average 50 lines per page (I know, most have 40, but I’m going to be reading the Feynman Lectures on Physics as part of this, which have ~60 lines of text per page), and has, on average, about 14 words per line (again I’m pulling this up from 12ish because the Feynman book averages about 16). So we’ve got 50 * 14 = 700 words per page. 20*700 = 14000 words, and at 200 wpm that means 70 minutes of reading. To start. On the high end, 120 pages, we’ve got 6 times as much, which means 420 minutes of reading. That’s insane.

If I can get up to 500 wpm, I can cut 70 minutes down to 28 minutes, and 420 down to 168. I think I can manage to read for 3 hours in a day, but seven is impossible for me. If I can get up to 800 words per minute (my dream goal), then I could read 120 pages in just 105 minutes. Just under 2 hours of solid reading every day. Scary, but I’d love to accomplish it.
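To double-check those estimates, here is a minimal sketch of the arithmetic. It assumes the same rough figures as above (50 lines per page, 14 words per line); the function name is just something I made up for illustration.

```python
# Assumptions taken from the estimates above: 50 lines/page, 14 words/line.
WORDS_PER_PAGE = 50 * 14  # 700 words per page


def reading_minutes(pages, words_per_minute):
    """Minutes needed to read `pages` pages at the given reading speed."""
    return pages * WORDS_PER_PAGE / words_per_minute


# Print the daily reading time for the first and last phases of the plan.
for pages in (20, 120):
    for wpm in (200, 500, 800):
        minutes = reading_minutes(pages, wpm)
        print(f"{pages} pages at {wpm} wpm: about {minutes:.1f} minutes")
```

Changing WORDS_PER_PAGE to something closer to a typical paperback would shrink every number, so these are probably pessimistic for anything that isn’t Feynman.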

Today, I’ll just read at my normal speed from whichever two of the books I have around (The Iliad, Red Mars, Feynman, and Walden). Starting tomorrow I’ll grab a book that I’ve read previously so that I can focus on speed without worrying about my loss of comprehension. My other book will be Stand on Zanzibar, which has a “stream of content” delivery, and may be best consumed at a lightning-fast pace with slight misunderstanding. Blog posts for the next 21 days will be responses to or summaries of the content I’ve read; they may be done with a day’s lag. Not sure yet! And as usual, I’ll occasionally leave updates with regard to the process.

The Web Is Complicated (and schools should do more to teach it)

In the past few weeks, I’ve been reading about some little aspects of web programming that I’d never really considered the existence of, let alone closely examined: for example, the DOM event model. There’s a deep amount of learning to be done here about the history, the problems that are being solved, and the present state of this system. While reading and researching, I was reminded of a discussion I tried to have with the School of Computer Science’s faculty during a planning retreat, back when I was a student representative. I do not remember the specifics very well because it was years ago, but it was along these lines:

Chair: (to all present) “Are there any other suggestions for areas of improvement in our teaching?”
Me: “I think we should do more to teach people about making applications for the internet — there’s a lot of deep information there, and we don’t really teach it. Sessions and login/password handling, HTTP verbs, caching, CGI, how web frameworks like Rails work — academically, we seem to live in a world where the internet doesn’t exist.” (only I was probably way more awkward)
Chair: “One issue with that is that we’re not in the business of teaching technologies; we teach concepts, and students are expected to be able to figure technologies out on their own”
A Professor: “I include some slides about HTTP when I teach my (optional) 4th year networking course” (which requires an optional 3rd year networking course)
Another Professor: “I teach students about REST and design principles in my (optional) software design course” (which has a reputation as a course for people who want to do nothing but write essays about design patterns)
Another Professor: “You have to use GET requests for the server you write in second year — we talk about HTTP verbs then (we talk about GET)”
Chair: “Well, it’s not our job to hold people’s hands and give them every bit of knowledge, especially not about specific technologies. We’re not going to make a Ruby on Rails course. The concepts are being taught, you just seem to have missed them”

I am probably editorializing too much, because everyone there was reasonable and (mostly) friendly, but I was given a distinct sense of “learner’s machismo”, that if I don’t know the answer to a question it’s because I’m not trying hard enough, not because there’s anything wrong. I had spoken to several students and gathered their requests: education on the internet was perceived as sorely lacking, and a very useful thing to add.

There is a new stream in the School which is called “Software Engineering”, as opposed to the generic Computer Science program. It sounds like they’re doing some more hands-on work in it than we did in the CS stream, so perhaps new students will have an opportunity to learn these sorts of things. It’s difficult for universities to face the topic of “abstract vs applied knowledge”, especially in the realm of Computer Science, because both sides have valid arguments.

My experience and training at school was not to be a competent programmer. I really didn’t program much over the 4 years at all — probably less than half as much as friends of mine who went through 2-year college Software Engineering programs. And what’s worse, code quality was worth almost nothing throughout my time in school: deadlines were hard and fast and we were typically marked on correctness and presence of documentation. Almost no time was spent reading code or talking about how to write good code, or to design good code. It seemed like the portions of our education that most closely related were talks about coupling, cohesion, UML and sequence diagrams.

Colleges are supplying a high level of technical knowledge for our field, and the candidates seem less confident that they know theory, but appear to be very well versed in it. In my experience, University students seem to be half way between good programmers and good researchers, without being either. Well, now I’ve wandered off topic. Code quality would be a great thing to try to teach, but for now, I want to reiterate that the web is complex, and we need to have more information about creating applications for it included in our education.

House is over

I just finished catching up on the final season of House. Spoilers: dinosaurs conquer the earth and convert it to a day spa.

I haven’t faithfully watched House for the last 8 years. I actually hadn’t seen more than the first season or two before last summer – but it’s had a constant presence. Several of my long-time friends have been fanatics about this show for longer than I’ve known them. It would regularly come on in the background while I did homework. House has been a commonplace meme on the internet. The finales are always held up as hallowed mindfucks. And I did catch the odd episode — I even tried to catch up once when there were only 4 seasons or so, and I only got midway through the second season before throwing my hands up in the air, proclaiming “I can’t watch a show where every single episode is the same!”

Almost two years ago, a friend of mine (one of those fanatics) was watching back through it. Pretty much every time I walked into his room, he’d have House playing, no matter what else was going on or what time of day it was. I would often just sit down on his bed and watch some of it, because it seemed so ridiculously entertaining. The one thing I could consistently count on was a really high production value. Despite the content seeming mundane, it always felt good to watch.

Last summer, when I should have been exploring San Francisco, I instead made a habit of lying on my bunk bed and watching episode after episode of House. I started at Season 4, in which House runs a competition to see who will be on his team, and gradually whittles down a long list of candidates in humorous ways. The clearly developing story helped to wipe away the sense of monotony that I came to hate in my previous attempt. Those episodes successfully held my attention until I got hooked! I stopped midway through the sixth season, wanting to leave some of the show for the future (I do this a lot; for example, I’ve still got a missing season of Scrubs to watch), and I’ve occasionally done little bursts, aiming to catch up not long after the finale.

That must have been boring for you — my apologies! As a part of finishing the show, I wanted to recount my experience with it. Every time you finish a series of novels or television show, there’s a time similar to a period of grieving. You can go back and rewatch the characters play out old interactions, but it’s more like high-fi memory than re-experiencing the things you first saw.

We place value on the stories that connect us to events like television shows or reading books. As we spend more time with them, we become more dedicated, and we pile up rich memories surrounding them — often more about our interactions with others than about them. For many people, their love of Harry Potter, Star Wars, Twilight, or Lord of the Rings isn’t really a love of the thing itself, but a love of the feelings they had and the people they knew while they experienced it. Media is an excellent source of nostalgia.

Anyway, here’s to you, House.

Basics of a Python Web Stack

When it comes to web development, you could fill warehouses with the stuff I don’t know. Five years ago, I knew a certain amount less than I do now — but there was also dramatically less to know. Reading this post (hn) today helped me to realize that if I do not take action, I may at some point look back at this point as the zenith of my knowledge, relative to what it takes to get a web application up and running.

As such, I’ve spent the last hour or two getting reacquainted with heroku (which, holy damn, heroku is so easy to use). I’ve followed their python tutorial and installed flask and whatnot, and realized that I’m very fortunate to already grok virtualenv, pip, flask, and gunicorn. The heroku docs do an excellent job of explaining how to set things up, if not what precisely they are. So, I decided to do some gap filling — here’s a brief overview of what these things are and why someone would use them:

virtualenv

Virtual Environments are used to create a python environment which is isolated from the rest of the machine. You can install packages into it and easily pack it up to move elsewhere, or duplicate it if needed. I can think of 5 things virtualenv makes it easy to do:

  1. Run multiple versions of python on one computer
  2. Avoid ongoing dependency conflicts between your own projects*
  3. Test/play with new libraries without messing other projects up*
  4. Identify requirements when packaging your code up to deploy
  5. Manage installed packages without being root or configuring by hand

* okay, these are the same problem, but they can come up independently of each other.

For ruby folks: think of virtualenv in the same vein as rvm, but with different philosophies. Some folks on hackernews suggested that an even better analog is rbenv, which I’ve never used — but it sure sounds right!

Typical usage is:

$ ls
$ virtualenv env
$ source env/bin/activate
(env)$ easy_install cool_module
(env)$ vim app.py
(env)$ cat app.py
import cool_module

# code which uses cool_module
...
(env)$ python app.py
# runs successfully
(env)$ deactivate
$ python app.py
# cool_module fails to import

Note the following:

  • On windows, ‘source env/bin/activate’ is ‘env\Scripts\activate.bat’
  • in bash, ‘source’ can be replaced by ‘.’ — it’s common to see ‘. env/bin/activate’ used
  • ‘env’ is just a common name; you can name an environment whatever you’d like
  • there are useful virtualenv switches which can be found by just running ‘virtualenv’, particularly ‘--distribute’, which adds pip by default.
  • ‘deactivate’ is a shell function that gets defined when you source the activate script (in bash, at least); it is not a separate program.
  • virtualenv is a program that needs to be installed, either manually or via easy_install, pip, or a package manager. It will be part of future versions of python, but for now your computer does not have it just because you have python.
  • there are many different setups of “where you put code” vs “where your env lives”, this is just one.
  • two advantages of not putting your code anywhere inside of your env/ folder are that you can put ‘env’ in your .gitignore, and that if things get messed up somehow, you can have a clean slate python-wise by rm -rf’ing your env directory.

Using virtualenv is very easy to do, and it will quickly become part of the mental model of your working environment. There is some streamlining left to do — ‘. env/bin/activate’ is cumbersome — but it’s a great tool.
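If you’re ever unsure whether the python you’re running is the one inside an environment, here is a small sketch of a check. The function name is my own invention; it relies on the fact that older virtualenv releases set sys.real_prefix, while the standard library’s venv module (and newer virtualenv) make sys.base_prefix differ from sys.prefix. Treat it as a best-effort guess rather than a guarantee.

```python
import sys


def in_virtual_env():
    """Best-effort check for whether this interpreter runs inside an isolated environment."""
    # Older virtualenv releases record the original interpreter in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # The stdlib venv module and newer virtualenv keep the original in sys.base_prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("interpreter prefix:", sys.prefix)
print("inside a virtual environment:", in_virtual_env())
```

Running it before and after ‘source env/bin/activate’ should show the prefix moving into your env directory.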

pip

I can’t write anything much about pip that hasn’t been written. People have discussed python package management systems very thoroughly! A quick sum-up, then:

Pip is a nice package manager for python. It tracks dependencies, can install/uninstall modules, and can distill a list of requirements (‘pip freeze’), which is both human and machine-readable. Pip can even read from one of its requirements-lists and duplicate an environment. There are apparently things about pip that aren’t great, but I encourage you to go google about them if you care to.

Common usages are: ‘pip install cool_module’, ‘pip uninstall cool_module’, ‘pip freeze’, ‘pip install -r requirements.txt’
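To show what “both human and machine-readable” means in practice, here is a rough sketch that reads the simplest kind of requirements list, lines of the form name==version. The package names and versions in the sample are made up, and real requirements files allow plenty of things this ignores (version ranges, URLs, editable installs), so take it as an illustration and not as a replacement for pip.

```python
def parse_requirements(text):
    """Map package name -> pinned version for lines of the form name==version."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "==" not in line:
            continue  # this sketch ignores anything that is not an exact pin
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


# Hypothetical 'pip freeze' output; these packages do not exist.
sample = """\
# requirements for a made-up project
cool_module==1.2.0
another_module==0.3
"""

print(parse_requirements(sample))
```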

Flask

Flask is a web application framework, like Bottle or Sinatra, and to a lesser extent like Django, Rails, or Symfony. It actually started off as a joke! It has a fairly small code footprint and nice documentation. It uses the Jinja templating engine by default and Werkzeug for routing (if you know those things that’s meaningful), and is a very common choice for writing api servers at the moment. It’s pretty darn easy to get started with.

Because it has such a small codebase, it’s pretty easy to hack up and extend if you need to do that kind of thing (you probably don’t), and it’s got a lot of features to allow for modular code. It’s also got a lot of extension stuff available that can fill most of the gaps you might be looking for. If you’re coming from Django, it’s like starting with the minimum and building up to what you need instead of starting with the maximum and wishing you could shed a few pounds.

Rather than outright duplicate the fine documentation work that’s been done, I suggest you take a look at the Flask docs themselves for an introduction to how to use it.

Warnings: Advanced practices are not very well covered by the documentation, and there are some weird practices around importing and using flask extensions due to a funky design decision they made when Flask was young and are still safely backing out of. Occasionally you’ll find extensions written for an “old style” of Flask extension import. The “new style” of extension importing is intended to support both new and old Flask extensions, but it still occasionally breaks. It’s beyond me when it does — I had this happen attempting to use Flask’s OAuth extension.

gunicorn

Ah, now we’re almost completely outside of my sphere of knowledge. Gunicorn stands for “green unicorn”, and it’s a web server. It’s better than the one that comes with flask, and is quite popular. Alternatives are things like Tornado, nginx, or apache.

Before you think “oh, I have apache installed! I’ll just go do that then instead”, be warned: mod_wsgi can be a pretty major pain to set up and diagnose problems for. I’ve heard that mod_passenger is a better way to go, but I haven’t tried it myself. If you can avoid putting python on apache, it’s probably a good idea.
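It helped me to see what a server like gunicorn actually talks to. Gunicorn serves WSGI applications, and a WSGI application is just a callable with a particular signature (a Flask app object is one too). Below is a minimal sketch using only the standard library; the names app and call_once are mine, and call_once fakes a single request so nothing here needs a real server.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A bare WSGI application: the callable a WSGI server invokes for each request."""
    body = b"Hello from a WSGI app\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_once(wsgi_app):
    """Invoke the app once with a fake request environment, roughly as a server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a plausible GET request
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = wsgi_app(environ, start_response)
    return captured["status"], b"".join(chunks)


print(call_once(app))
```

If this were saved as a hypothetical hello.py, running ‘gunicorn hello:app’ is the usual way to point gunicorn at the callable.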

More things

I don’t know anything about front end frameworks like jQuery, Backbone, or Bootstrap, and I haven’t ever written coffeescript. They’re what I’m going to mess around with a bit this weekend — Backbone is just too new for me to have played with yet (or, I feel like it is!), while jQuery’s anonymous callbacks-with-params mess with my head when I try to build a mental model of how information flows through a program. I feel like I’m supposed to “just trust” that information is there, and that’s weird. We’ll see how I feel in a few days!

(Also, so long and thanks for a super towel day)

The Elements of Coding Style (should not matter)

conventions are a flawed solution to the code-style problem

In my personal code, I like to write names_like_this, and Types_Like_This, and CONSTANTS, and I indent by 2 spaces. I put {‘s on new lines for functions, but not for control structures. And whenever I have 2+ single-line ifs, I’m unlikely to bother writing {}’s out.

No programmer needs to be told about the nature of the debates on style in code. To any non-technical person reading this: imagine multiple Sheldon Coopers arguing about an issue more esoteric and asinine than he normally bothers with. Except that for us, it does tend to matter, and here’s two reasons why:

If there isn’t a consistent set of rules about code style, things will be harder to read for everyone, and there will be more mistakes. Picture collaboratively written prose where the authors insist on spelling words in their own unique ways. That’s why we need to choose a single style in the first place.

If the style is not the one you naturally think in, you will face at least some period of difficulty reading and writing it. Imagine if we decided English would be written with capitalization inverted from how it is today, and instead of spaces, we’d use ^ marks. tHINGS^WOULD^LOOK^LIKE^THIS,^AND^PEOPLE^WOULD^RIGHTLY^COMPLAIN. It’s jarring and uncomfortable, but you would eventually get used to it.

The mental and temporal cost of choosing a specific style for a new project, imposing a style on written code, or changing styles in an existing project is huge. Builds break, people hunt for inconsistencies for days, people write code and then groan and rewrite code, people use complicated regular expressions that become time-sinks. Code linters tell us about our follies: but we have to fix them. Each wasted second is small, but it’s a break in focus, and it’s huge across a hundred developers over a year.

I propose a Code Style Switcher

I followed the following line of reasoning:

  1. We can identify elements of code style
  2. We can specify a set of code style conventions
  3. We can formalize a set of conventions
  4. We can automatically detect failures against code conventions
  5. We can recommend style-fixes <- this is about where existing technology ends
  6. We can automatically apply style-fixes
  7. We can read a file in and spit it back out semantically identical, but formatted in a different style
  8. We can make a system where everyone writes and sees their own style

From there, I thought of a few other cool things — given a code corpus:

  1. We can detect code conventions
  2. We can detect personal code conventions
  3. We can detect ‘problem spots’ where inconsistencies arise
  4. We can calculate a minimum-effort code convention for a team, if we want to

The key idea is that every developer could write their own code, in a way that makes sense to them. Every developer would see code that makes sense to them. There would be no build-breaks based upon lint failures, no “boyscouting” to bring non-conformant files you touch up-to-snuff. And no fussing in code reviews about bracket placement or which character is used for whitespace (and how many of them).

This is a big problem, and it’s silly that we haven’t universally solved it, when it’s exactly the sort of problem computers are good at solving. (it’s basically machine learning search and replace).

Implementation

A proof of concept would likely operate on a very small scale with specific configurable elements —  e.g. {} placement and indentation. A suitable expansion from that point would be to deal with naming conventions (camelCase versus underscores, _names, TypeNames, etc). From there, the horizon point would be an extensible style language capable of defining a style for any parseable token or combination of successive tokens for a given language, as well as the ability to exclude certain tokens from being processed.
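The naming-convention step can be sketched at the level of a single identifier. This is only a toy under stated assumptions: the function names are mine, it knows nothing about scopes, strings, or comments, and the conversion is lossy for odd names (doubled or trailing underscores, acronyms), which is exactly where semantic breaks would creep in.

```python
import re


def snake_to_camel(name):
    """Convert names_like_this to namesLikeThis, keeping any leading underscores."""
    stripped = name.lstrip("_")
    prefix = name[:len(name) - len(stripped)]
    head, *rest = stripped.split("_")
    return prefix + head + "".join(part.capitalize() for part in rest)


def camel_to_snake(name):
    """Convert namesLikeThis to names_like_this."""
    # Put an underscore before each capital that follows a lowercase letter or digit.
    return re.sub(r"(?<=[a-z0-9])([A-Z])", lambda m: "_" + m.group(1).lower(), name)


for identifier in ("names_like_this", "_private_name", "plain"):
    converted = snake_to_camel(identifier)
    print(identifier, "->", converted, "->", camel_to_snake(converted))
```

A real switcher would need a parser to decide which tokens are safe to rename; this only shows that the per-name mapping is the easy part.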

Eventually, the switcher would have to be set up as a plugin for an editor, to allow reading/displaying and writing files in a given style. I’ll likely make a stylesheet generator at some point that shows you “your style” given a piece of code and identifies inconsistencies. The underlying implementation will likely be a C library for the sake of portability, but that may come after prototypes written in higher level languages more suited to string processing or with nice native data structures (perl, python, ruby, or js).

Code would likely be stored on the machine in whatever style the author defined for it, but when committed to a repository, it would likely be wise to somehow reconcile the styles. That could be accomplished with post-commit or post-push scripts, but I’d like to think about that problem more. It’s a long ways down the road. I also need to think more about how to get from detecting a failure to applying a fix. The possibility of semantic breaks is very real, and determining “the safe route” may not be tractable for me.

Anyway, now I’ve just got to do it! I’ll add more posts as I make progress — remember though, I’m pretty busy 🙂

Post-script: I hardly even noticed, but I’m on day 18 now. I need to get a “next thing” figured out fast. Off the top of my head are speed-reading, reading philosophy and learning math. I do intend to continue blogging every day to increase my luck surface area, and also to help in attaining/retaining a producer’s spirit.