Wrap Comments and Text to Column Width in IntelliJ Editors

One of the small annoyances I found after switching to IntelliJ text editors recently was that the editor won’t reformat selected text to your chosen column width. Annoying, for Vim and Emacs users!

I corrected this by writing my first IntelliJ plugin: Wrap to Column, which is a port of a different plugin I wrote for the same feature in Sublime Text 2.

My “Wrap to column” command wraps selected text or the current line to the column width you’ve configured for the project. The goal is to match the functionality of Vim’s “reformat lines” (gq) command and “fill paragraph” in Emacs.

Just so you know, there is a “fill paragraph” command in IntelliJ — something they added in recent months — but I’ve not been able to get it to do anything but merge all selected text into a single line that is longer than my column width setting. After filing bug reports and sending emails, I gave up and wrote the feature myself, since I’d already done so once before.

If you end up using it, drop me a line or comment on the plugin page if it’s working for you. I could use more testers!

Build an In-Memory Proximity Search Web Service with Python, SciPy and Heroku

In this post I’m going to look at a concrete example of building an in-memory proximity (aka, nearest neighbor) search web service using Python, SciPy and Heroku.

Later we can speculate on use cases for this approach as opposed to a geo-aware database.

Define Our Terms

So, let’s define our terms:

  • In-memory: The web service process contains the data we will query.
  • Proximity search: Given a latitude/longitude coordinate, return a set of results within a fixed distance from that location. Also called nearest neighbor search, closest point search, etc. I prefer “proximity search.”
  • Web service: This will be a JSON web service.
  • Python: We’ll use Python 2.7.
  • SciPy: We’ll use a C component of SciPy to do the search, namely scipy.spatial.cKDTree.
  • Heroku: We’ll deploy the project on Heroku using a custom Python build-pack to install SciPy.

The Example Project

All of the code I’ll discuss and quote is available in an example project on GitHub.

The proximity search that the example performs looks up statistics (nothing fancy now, just sums) about crimes that occurred in Portland, Oregon near a given location.

Also, a warning: this is by no means an attempt at production-ready code.

The Secret Sauce: SciPy’s K-D Tree

The fastest way to do a proximity search lookup in Python that I could find was SciPy’s implementation of a k-d tree. For more information, check the Wikipedia article.

In short, a k-d tree is a binary space partitioning tree, and SciPy’s C implementation is pretty fast. Here are the docs for the code we’ll use.

The class is pretty simple. According to the docs, we load in some data and get a query method that we can use to perform a nearest-neighbor search.

Building the Proximity Search

So, let’s look at some example code that loads up the k-d tree. Then we’ll look at code that performs the query.

I’ve simplified part of a class I used in the example project to do this. The source is available on GitHub.

Here’s a Gist of the code we’ll look at first:

We have an __init__ method that creates a self.crimes object.

Let’s assume this code loads a file that contains crime data tagged with Mercator coordinates into a dict whose keys are coordinates and values are an array of crimes that occurred at that location. (Presumably you also have a file of geo-tagged data that you wish to offer a proximity search web service for.)

On line 13 we create the k-d tree of crime locations. What we’re doing here is taking the coordinates of all known crimes (not the crime objects — just the coordinates, which are stored as keys in the self.crimes dict) and passing them into the scipy.spatial.cKDTree constructor. The cKDTree builds an index of the coordinates.

Next we have a get_points_nearby method that performs the nearest-neighbor(s) query against the k-d tree. The call to query is on line 24.

We send a coordinate into query and we get back a tuple containing the distances and indices of the nearest neighbors within the maximum distance that we supplied (in this case, half a mile).

That’s the meat of the proximity search, just passing the buck to SciPy — we now have our coordinates and we can look up in self.crimes the actual crime data that map to those coordinates.
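Since the Gist itself isn't reproduced here, the following is a minimal sketch of the approach described above, not the example project's actual code (so the line numbers mentioned earlier don't apply to it). The class name CrimeIndex, the max_points parameter and the sample data are assumptions made up for illustration. It also assumes the coordinates are in a planar projection, so that the tree's Euclidean distances are meaningful.

```python
import numpy as np
from scipy.spatial import cKDTree


class CrimeIndex(object):
    """Hypothetical stand-in for the tracker class in the example project."""

    def __init__(self, crimes):
        # crimes: dict mapping (x, y) coordinate tuples to lists of crime
        # records. Coordinates are assumed to be in a planar projection.
        self.crimes = crimes
        # Keep the coordinates in a list so the indices returned by the
        # tree can be mapped back to dict keys.
        self.points = list(crimes.keys())
        self.tree = cKDTree(np.array(self.points, dtype=float))

    def get_points_nearby(self, point, max_distance, max_points=250):
        """Return up to max_points known coordinates within max_distance."""
        distances, indices = self.tree.query(
            point, k=max_points, distance_upper_bound=max_distance)
        # With k=1 the query returns scalars, so normalize to arrays.
        distances = np.atleast_1d(distances)
        indices = np.atleast_1d(indices)
        # Missing neighbors are reported with an infinite distance.
        return [self.points[i]
                for d, i in zip(distances, indices) if np.isfinite(d)]

    def get_crimes_nearby(self, point, max_distance):
        """Map each nearby coordinate to the crimes recorded there."""
        nearby = self.get_points_nearby(point, max_distance)
        return dict((p, self.crimes[p]) for p in nearby)
```

Calling get_crimes_nearby((0.0, 0.0), 2.0) on an index built from a few sample points returns only the entries whose coordinates fall within two units of the origin. The max_distance argument is in whatever units the coordinates use.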

Creating the Web Service

Assuming your source data is already in latitude and longitude form, you can already use cKDTree in the fashion we’ve looked at to do proximity search. Now we just need to wrap it up as a web service.

The following is an example using Flask because it’s a pretty easy framework to deploy to Heroku. I’ve edited it to remove only a couple of lines from the real file.

This file defines two web services, one available at /crimes/stats/<longitude>,<latitude> and one at /crimes/<longitude>,<latitude>.

Assuming that the PortlandCrimeTracker object is capable of giving us back sums by category for crimes discovered near a coordinate, the rest of the work done in these services is ceremony: get_point tries to obtain a coordinate from the current request, and if it fails, causes Flask to return a 400 status code for the request. Meanwhile, get_crimes passes a valid coordinate and any GET parameters found in the request to PortlandCrimeTracker.get_crimes_nearby, which returns data on crimes near the coordinate.
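To make the "ceremony" a little more concrete, here is a rough sketch of the kind of validation a helper like get_point has to do with the <longitude>,<latitude> part of the URL. It is not the code from the example project: parse_point is a hypothetical name, and it is written without Flask so it can stand alone. In a Flask view you could catch the ValueError and call flask.abort(400).

```python
def parse_point(raw):
    """Parse a 'longitude,latitude' string into a (lon, lat) tuple.

    Raises ValueError if the string is malformed or out of range, which
    a web service could translate into a 400 response.
    """
    parts = raw.split(',')
    if len(parts) != 2:
        raise ValueError('expected "<longitude>,<latitude>"')
    # float() raises ValueError itself for non-numeric input.
    lon, lat = float(parts[0]), float(parts[1])
    # NaN fails both comparisons, so it is rejected here too.
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError('coordinate out of range')
    return lon, lat
```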

Deploying SciPy on Heroku

The trick to deploying SciPy on Heroku is using a custom buildpack. Fortunately, someone has already created one for this purpose. Some details about using it are available in this Stack Overflow comment.

I forked the buildpack for the sole purpose of pointing all of the repo URLs at my GitHub account.

Performance

Deploying this code onto Heroku with a free account (1 dyno) using no cache, Gunicorn and two workers got me an average of 98 requests per second after around 6000 requests.

Why not PostGIS?

Well, for one, to have fun!

I like the idea of doing an in-memory search more than storing geo-data in a database when the dataset is frozen or it changes at regular and predictable intervals. So, I wouldn’t use this for an application that made user-entered locations searchable. In that case I would probably use PostGIS.

That said, we swapped a PostGIS dependency for a SciPy dependency, and with Heroku that turns out to be less than straightforward to deploy.

Revisiting Umberto Eco’s Future of the Book

In 2003 Umberto Eco gave a talk at the opening of the Bibliotheca Alexandrina titled, “Vegetal and mineral memory: The future of books” that is interesting to read again now, nine years later.

His goal in the talk was to break apart the fear people had about “the future of books.” He split this fear into two parts: anxiety about the future of the physical artifact — the bound book — and a concern that what people usually read in them was changing.

“Good news: books will remain indispensable,” Eco wrote in 2003, and I agreed with him. The idea that anything could replace paper seemed ridiculous. Yet now, reading an ebook is enjoyable, convenient and becoming more and more affordable every fiscal quarter. I see them at coffee shops, on the bus and in my home all the time — and that doesn’t address the use of phones to read, either. No, the experience is not the same as with physical books. There are no “used” ebooks for purchase; we cannot easily give them to friends after reading; we never own them in the same sense. Still, just as CDs and tapes have fallen away because of the utility of digital copies, there are many benefits to ebooks, enough that I may prefer them over physical books — something I thought impossible in 2003. (Every time I pick up an actual paper book, though, I’m reminded of what I miss about them, so the jury is still out on whether digital can truly replace paper books for me.)

The novel as an art form would stick around, he claimed. Preposterous, I thought! How quaint! Of course hypertext would somehow replace the novel (that, or story-based video games). Linking and forking paths seemed more appropriate to the future than the old frozen novel. However, I’ve come to think the opposite. Truly, nothing can replace the novel, or at least the central aspect of the novel, which is our inability as readers to change the story. This is, in short, the reason he gave in his talk: the novel mirrors human reality. Our past is frozen, and that is the source of all tragedy, isn’t it? We can’t change what happened. Novels communicate with our experience as humans on this deep and penetrating level of reality. In video games and in the theoretical “hypertext” novel (theoretical because none have been done well, right?), the reader may choose what happens, and more importantly the reader is always free to do things over. That isn’t what life is like. You may be able to try again, but maybe not. As with a novel, we can only analyze and learn from the past.

So I look forward to reading Eco’s talk again in 2022 or 2023 (maybe let’s make it a round 10 year check-in). Human beings are remarkably effective at creating art forms that speak to our current reality, so I’m excited to see what we dream up to reflect our new highly-connected, internetworked selves.

Set an ImageField path in Django manually

Apparently this is a confusing topic. Let’s say you have a Django model with an ImageField and some existing media files, and you’d like to connect the files to the ImageField. This is relatively painless and doesn’t require you to use the save() method on the ImageField.

If you only want to set an ImageField to point to an existing file, assign a string containing the path relative to settings.MEDIA_ROOT to the field. E.g.:
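The original snippet isn't reproduced here, so this is a small sketch of the idea rather than the exact code. The helper media_relative_path, the Photo model and the file names are all hypothetical. The helper only builds the string, and the commented lines show how it would be assigned inside a Django project.

```python
import posixpath


def media_relative_path(upload_to, filename):
    """Build the string to assign to an ImageField.

    The result is relative to settings.MEDIA_ROOT and includes the
    upload_to prefix, since Django won't add it for you here.
    """
    return posixpath.join(upload_to.strip('/'), filename)


# Inside a Django project, with a hypothetical model such as
#   class Photo(models.Model):
#       image = models.ImageField(upload_to='photos')
# the assignment would look like:
#   photo = Photo.objects.get(pk=1)
#   photo.image = media_relative_path('photos', 'sunset.jpeg')
#   photo.save()  # save the model instance, not the field
```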

This works as of (and possibly before) Django 1.3.

Note a couple of things:

Setting the path ignores your upload_to value. You have to prefix your string with that value yourself.

Again, you don’t have to call save() on the ImageField — just the model instance. There are some StackOverflow posts that advise going the route of calling save() on the ImageField, which requires you to open a file descriptor, etc. This isn’t necessary, will end up copying the file, and ignores any prefixed paths, resulting in a path that is always, in my testing, settings.MEDIA_ROOT/filename.jpeg.

Go: How to Get the Directory of the Current File

In Python I often use the __file__ constant to get the directory of the current file (e.g., os.path.dirname(os.path.realpath(__file__))). While writing a Go app recently, I found that there wasn’t an immediately obvious (or searchable, even with “golang”) equivalent.

But in the annals of the golang-nuts mailing list I eventually found out about runtime.Caller(). This returns a few details about the current goroutine’s stack, including the file path.

The context of my problem was opening a data file in a shared package. What I cooked up was:

The argument to runtime.Caller is the number of stack frames to skip. According to the Go docs, 0 identifies the caller of runtime.Caller and 1 identifies that function’s caller, so pick the value that lands on the file you want details about. Check the docs for more in-depth coverage.

It’s not quite as elegant as __file__ but it works.

Run Django Unit Tests in a Sublime Text 2 Build System

With the dev builds of Sublime Text 2, you can easily set up a build system that runs Django unit tests. You do this by adding a build system to your Sublime Text 2 project file for the Django project.

I’ll include an example project file in this post that runs manage.py test --noinput as the build command. In order for this to work with the current implementation of build systems, the project file must add the Django project dir and the virtualenv’s site-packages directory to the PYTHONPATH.

Note a couple of things:

  • In the project file, ${project_path} refers to the directory in which the project file exists: I created mine in the virtualenv directory (the directory with the bin/ and lib/ directories for the virtualenv). For more info on other possible substitutions see the docs (note that I couldn’t get substitutions to work in the “env” dictionary)
  • My placeholder text django_project_dir stands for the directory that contains your Django project files inside of the virtualenv
  • This doesn’t use the python binary in your virtualenv — so your mileage may vary

One final thing to note is that I’ve added the lib/python2.7 directory as a folder in the project file. This is not related to the build system. It simply includes libraries in my “Find in File…” searches, allowing me to easily search for, e.g., Django classes and usages alongside my own. (SublimeRope is also a helpful tool for exploring.)

Anyway, here is the example project file:

Python: How to tell what class a decorated method is in at runtime

When profiling a Python app, it’s helpful to have a decorator that wraps functions and reports details about their performance. Assuming you are doing this to report some metric about the function, you’ll want the decorator to work with both bound and unbound functions (i.e., methods of a class and regular functions), and if the decorator wraps a method, you’ll probably want to know the name of the class the method belongs to.

There is only one way (update: two ways) I’ve found to do this. They both involve the inspect module. (There are obvious ways to do it that don’t work with decorators.)

You can’t assume that the first argument in a function always refers to its class because that would break in the case that the function was not a method. However, the strong convention in Python is to name the first argument of methods ‘self’. Relying on this convention, we can easily tell if a function includes such an argument with the inspect module.

The inspect module’s getcallargs function will map the parameter names from a function’s signature to its arguments. Given a function fn that your decorator wraps, you can call inspect.getcallargs(fn, *args) at runtime and discover that the first positional argument is the ‘self’ parameter defined in the method’s signature!

Now, the use of ‘self’ is such a strong Pythonic convention that it’s good enough in most cases to use the presence of such a parameter in a function to know that it’s a bound function (again, a method of a class), and you can then inspect the ‘self’ argument for its class using __class__.__name__.

Here is an example decorator that will print out the class name of a decorated bound function:
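The original example isn't reproduced here, so what follows is a minimal sketch of the technique just described, not necessarily the exact code. The names owner_class_name, report_class, Greeter and shout are made up for illustration. Note that inspect.getcallargs is marked as deprecated in the Python 3 docs in favor of Signature.bind, though it is still available.

```python
import functools
import inspect


def owner_class_name(fn, args, kwargs):
    """Return the class name if fn was called as a method, else None.

    Relies on the convention that a method's first parameter is 'self'.
    """
    call_args = inspect.getcallargs(fn, *args, **kwargs)
    if 'self' in call_args:
        return call_args['self'].__class__.__name__
    return None


def report_class(fn):
    """Decorator that prints 'ClassName.method' for decorated methods."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        name = owner_class_name(fn, args, kwargs)
        if name is not None:
            print('%s.%s' % (name, fn.__name__))
        return fn(*args, **kwargs)
    return wrapper


class Greeter(object):
    @report_class
    def greet(self, who):
        return 'hello ' + who


@report_class
def shout(who):
    # A regular function: nothing is printed for this one.
    return who.upper()
```

Calling Greeter().greet('bob') prints the class and method name before returning, while shout('hi') passes straight through because its signature has no ‘self’ parameter.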

Update: In Python 2.6, which does not include inspect.getcallargs, you can use inspect.getargspec to similar effect:
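Again, this is a sketch of the idea rather than the original snippet, and class_name_of is a hypothetical name. One assumption worth flagging: inspect.getargspec was removed in Python 3.11, so the sketch prefers inspect.getfullargspec when it exists and only falls back to getargspec on older interpreters. In both cases the first element of the result is the list of argument names.

```python
import inspect

# Prefer getfullargspec (Python 3); fall back to getargspec (Python 2).
_argspec = getattr(inspect, 'getfullargspec', None) or inspect.getargspec


def class_name_of(fn, args):
    """Return the class name if fn's first parameter is 'self', else None.

    fn is the undecorated function and args are the positional
    arguments the wrapper received at call time.
    """
    arg_names = _argspec(fn)[0]
    if args and arg_names and arg_names[0] == 'self':
        return args[0].__class__.__name__
    return None
```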

Make Sublime Text 2 More like Vim: Wrap Code, Go To Last Edit, Jump Back, and More

I’ve been trying out Sublime Text 2 as a replacement for Vim. While I enjoy using it and I experienced the “Wow, this does 90% of what Vim does” moment, I kept a running list of all the features in the remaining 10% that I relied on every day.

These included:

  • Better code wrapping (gq)
  • Go to last edit ('.)
  • Go to file in Ack search output
  • Display full path to current file (:echo expand('%:p')<CR>)
  • Jumping back and forward through files

However, Sublime Text 2 has a great Python API and I was able to whip up plugins for these tasks that perform just as well as in Vim. Update: I also found the navigationHistory plugin, which does a pretty good job of back/forward jumps.

Note: The following plugins are not verified to work with Sublime Text 3. If you’d like to contribute to updating them for ST 3, you can do so here.

Better code wrapping

Vim’s reformat text command (gq) can take multiple paragraphs and text in comments and flow them to the current textwidth setting. Some people don’t care about this, but I prefer to keep lines of code less than 80 characters wide, so I can open multiple files side-by-side. (Also, PEP 8.)

Sublime Text 2 had a “wrap” feature, but it failed to intelligently wrap comments, it joined separate paragraphs together and it wouldn’t reflow selected text (only the paragraph around the cursor).

The plugin I wrote creates a Wrap Code command (mapped to gq in Vintage mode) that works reasonably on commented lines of code (and uncommented lines), multiple paragraphs and selected text, thanks to the codewrap.py module written by Nir Soffer.

Download the WrapCode plugin on GitHub.

Go to the last edit

I’m used to typing '. in a buffer in Vim to move the cursor to the last edit. Great feature. Totally underrated.

If you use Vintage mode in Sublime Text 2, you’ll quickly discover that this command does nothing — worse than nothing, in fact, as it seems to refocus your cursor somewhere other than the text of the buffer, forcing you to use your mouse to recover.

While I couldn’t bind my plugin to the '. command without forking Vintage mode, I bound it to Super+' and it works the same as Vim’s.

Download the GotoLastEdit plugin on GitHub.

Easily open search results

Another thing I missed about Vim was its Ack plugin. The “Find in Files” feature of Sublime Text 2 is great, but it didn’t provide an easy way to quickly open a file listed in the search results via the keyboard (you can double-click on a line to open it, though).

So I wrote a plugin that, in a Search Results window, allows you to do one of the following via the keyboard:

  • On a “matched” line in the search output, open its file at the line of the match
  • On a file path in the search output (without a line number), open the file in a new tab

Download the OpenSearchResult plugin on GitHub.

Display full path to file in status bar

I use Vim in OS X’s full screen mode, with no tabs or status line. Working this way, I don’t have reference to the path of the current file. Of course, I know the name of the file because I usually typed it, but sometimes the full path is important; e.g., if I have two Mercurial branches of the same code in different directories.

So, I have a command mapped to ,F that displays the full path. Then it silently goes away after a moment. I love that command.

I couldn’t find an ideal way to implement this in Sublime Text 2, other than to create a command that would toggle displaying the path to the current file in the status bar. It works well enough for me, however.

Download the FilenameStatus plugin on GitHub.

Jumping back and forward

One of Vim’s most awesome features is Control-O/Control-I to jump back and forward — maybe I can generalize to say that all of Vim’s jump commands are part of its “killer app” status.

Sublime Text 2 doesn’t have a feature like this, but someone on the internet has packaged a version of Martin Aspeli’s navigationHistory plugin, which comes close.

Download the navigationHistory plugin on GitHub.

Installing the plugins

You install these plugins the same way as other Sublime Text 2 plugins, by downloading the files and dropping them into the Packages directory.

See this documentation for more details if you need additional help installing plugins.

The default key bindings are intended for Vintage mode and are oriented for OS X.

Sharp Edges: Protecting Ourselves from Digital Publishing

Nicholas Carr wrote in a recent article that he considered the ability of publishers to change text after they had released it “insidious” and a “bane” of digital publishing — specifically, if such changes are made in response to market research.

I agree that there is a challenge inherent in the new ease with which publishers may release versions of a text, but the challenge I see is different than the one Carr suggested. Data about the chapters that readers skip and areas that cause people to abandon their reading will only help publishers create better, more relevant content, just as this data has helped web site authors do the same. It is our response, as readers, to the possibility of frequent (and silent) revisions to text that I worry about.

Movable Text

As a summary of the key difference between print and electronic publishing, Carr described electronic publishing as having replaced Gutenberg’s movable type with “movable text.” For centuries, once set and printed, the text of a book remained the same. Today, with web sites and now ebooks, publishers may change the text at any time, introducing multiple divergent copies or, if the distribution method supports it, even changing the copy you are reading as you read it.

Carr wrote about the downside of this change that,

The promise of stronger sales and profits will make it hard to resist tinkering with a book … adding a few choice words here, trimming a chapter there, maybe giving a key character a quick makeover.

Books That Are Never Done Being Written

What is wrong with this? While some readers may finish every book they start, I have dismissed dozens of books in my life, at various points in their stories, after having read one too many missteps of voice or plot, or simply because I was bored. Life is short. There are more books to read than I have time for. So why should I read a poorly written book, and what do I care for the “shape” of a book (to quote Carr quoting Updike) if I can’t connect with it?

Creators and publishers who release work on the internet can get faster and more in-depth feedback from consumers than in traditional publishing, not just through comments but through analytics about a reader’s behavior. Analytics can show what people look for in a text and different ways they respond (e.g., most people stopped reading on page two). As anyone who has written a blog or maintained a web site will know, this information is extremely valuable as a way of testing what people want to read and what they don’t.

The same will now be true for books. As readers, our reactions to books, not just the fact that we purchased them but more intimate details like how long we lingered on a page and where we stopped reading (if it wasn’t the end), will place us in tighter feedback loops with authors. How is this a “bane”?

Authentic Text

What is wrong with this model is that we must change our idea of the persistence and security of human knowledge to fit it. We have to create new mechanisms to ensure the authenticity of texts. There are measures we can take to accomplish this:

  • Creators should have control over changes made to their works, to protect themselves from publishers introducing alterations based on sales data.
  • Readers should have access to all released versions of a text. Each authorized edition of a text should have its own ID that is registered with a trusted authority. And if the publisher releases a new version of a book that is already on our ebook readers, we ought to have the right to approve whether or not we update to it.
  • A trusted organization of the public good should house a copy of each version of released texts, to reduce the chance that individuals, companies and governments can alter or destroy the source files.

In the traditional publishing model, a printed edition of a work with an ISBN is a “known good” source copy. It is authorized by someone. Multiple editions of a book may exist, but as readers and historians we can examine our authorized copies of these editions. We can protect them.

As we move toward using and relying on digital text, we must develop new means of protecting the authenticity of this information. The bane — the sharp edge — of “movable text” is not the ease with which we may change our books after we publish them. It is that our mechanisms for protecting ourselves from such change are outdated.

iA Writer and Notational Velocity

The OS X apps iA Writer and Notational Velocity are great for writing and taking notes, respectively.

I wanted to try using them together, by setting iA Writer as the external editor for Notational Velocity, but after doing so I noticed that iA Writer would not open with the keyboard shortcut (Command-Shift-E), and it was grayed out in the ‘Note->Edit With’ menu.

After tinkering, I found that Notational Velocity’s note storage must be set to plaintext files and the file extension must be ‘.txt’ (or ‘.md’, which is iA Writer’s default file extension). The rest of this guide will assume you want to use ‘.txt’ as the file extension for notes.

Full Requirements

So, the full requirements to use iA Writer with Notational Velocity are:

  • Set iA Writer as the External Editor in Notational Velocity
  • Use plaintext file storage for notes in Notational Velocity
  • Use the ‘.txt’ extension for note files

Set iA Writer as the External Editor

  • Open Notational Velocity’s preferences
  • Click ‘General’
  • Choose ‘iA Writer’ in the External Editor drop-down

Use Plain Text Storage for Notes

  • Open Notational Velocity’s preferences
  • Click ‘Notes’
  • Click ‘Storage’
  • Choose a folder to save and read notes from in the ‘Read notes from folder’ drop-down (I suggest a Dropbox folder)
  • Choose ‘Plain Text Files’ in the ‘Store and read notes as:’ drop-down
  • Highlight ‘txt’ in the Extension list and click the check-box to make it the default extension

Changing Existing File Extensions

If iA Writer is still grayed out in the ‘Note -> Edit With’ menu or fails to respond to the external editor keyboard shortcut, make sure that the notes you are editing have the ‘.txt’ extension.

All of your notes should now be stored as individual files in the folder you chose in Notational Velocity’s preferences. To change the extension on these files, open that folder in Terminal or Finder and rename the files so that they end with ‘.txt’.

Batch Renaming Files in Terminal

If you have tons of notes and need to change all of them to a new extension, open Terminal and cd to the folder you chose for note storage in Notational Velocity. Then run the following command (assuming the original extension of the files was ‘.wiki’, which it was in my case):

    for old in *.wiki; do mv "$old" "$(basename "$old" .wiki).txt"; done

Done… Bonus Round?

You should be set now.

However, iA Writer uses the Markdown format, so as an extra bonus you could add a Markdown plugin to Vim, allowing you to easily edit the same text files in Notational Velocity, iA Writer and Vim, which is of course the greatest text editor of all time.