About b4winckler

Mathematician / MacVim developer

Creating and sharing an R package which calls C code

In this post I will demonstrate the minimal number of steps needed to write and share an R package which calls C code. It grew out of my own experiences in writing and sharing the LXB package with colleagues at work.

I have created a repository at GitHub called reverse which contains a complete example of a minimal R package (as well as the latest version of this document). In order to follow this post you may want to clone that repository.

The Writing R Extensions manual contains a reference for everything that is mentioned here (and more).

Packages with C code (more painful than you’d like unless your intended audience is programmers)

My assumption is that you want to build an R package and share it with your colleagues. As it turns out, this is not so simple. When you create a package it will by default just be an archive containing the source code that you write. This means that a C compiler is needed in order to install the package. If your colleagues all have a C compiler installed (maybe they are programmers or all use Linux) then this is no problem. If not, get ready for a world of pain.

It is possible to create binary packages that do not require a C compiler for installation. The catch is that they will only install on a machine with the same operating system version and R version as was used to build the package. You can get around this problem by uploading your package to CRAN, since it creates binary versions of your package for you. However, submitting to CRAN involves quite a bit of work. You have to make sure your package compiles on Linux, Mac OS X as well as Windows. The submission process is not automated, so expect a delay of a couple of days between submission and a binary version of your package being available.

Before going any further you should ask yourself if the pain is worth it. If you can get away with writing all your intended functionality in R without it being too slow or using too much memory, then I’d suggest you stick with a pure R solution.

How to call C from R

Let me outline how R calls out to C without getting into the details of the R Foreign Function Interface since this would take too much space.

R calls a C compiler to build a shared library of your C code. This library can then be dynamically loaded into R and you can call functions that you have exported in your C code. Here are the steps involved:

  • To generate a shared library call

    R CMD SHLIB reverse.c

    in the src/ directory. This will create the shared library reverse.so. The file name extension depends on which operating system you are using (I am using Mac OS X).

  • To load the library inside R you call dyn.load('reverse.so').
  • Finally, to call the function exported in your C code, call .Call('reverse', 1:10). Here reverse is the name of the function exported in reverse.c and 1:10 is the (only) parameter this function takes.
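The steps above never show the C side, so here is a minimal sketch of what a file like reverse.c could contain. It is not the code from the reverse repository: the helper reverse_ints and the BUILD_WITH_R guard are names made up for this illustration. The guard is only there so that the plain C part compiles on a machine without R; inside a package you would drop the guard and keep the part that uses R's C API from Rinternals.h.

```c
#include <stddef.h>

/* Plain C helper (hypothetical name): copy n ints from in to out in
   reverse order. The two arrays must not overlap. */
void reverse_ints(const int *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = in[n - 1 - i];
}

#ifdef BUILD_WITH_R  /* hypothetical guard; remove it inside a package */
#include <R.h>
#include <Rinternals.h>

/* Entry point reached by .Call('reverse', x). With .Call, every
   argument and the return value is a SEXP (a pointer to an R object). */
SEXP reverse(SEXP x)
{
    if (!isInteger(x))
        error("reverse: expected an integer vector");

    int n = LENGTH(x);
    /* Protect the freshly allocated vector from R's garbage collector
       until we hand it back. */
    SEXP out = PROTECT(allocVector(INTSXP, n));
    reverse_ints(INTEGER(x), INTEGER(out), (size_t)n);
    UNPROTECT(1);
    return out;
}
#endif
```

In this sketch the type check turns a wrong argument type (for example a vector of doubles) into an ordinary R error instead of undefined behaviour in C.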

Manually generating a shared library is a bit messy (it generates .o and .so files in the directory where your source code is) and using .Call() can be slightly dangerous (e.g. what happens if you pass the wrong type of parameters?). The solution is to generate a package.

Preparing a package

A package is a directory with the structure of the reverse repository (there may be more files and folders, but I tried to make the package minimal). Here is an overview of what goes where in the package:

  • Put the C source code in the src/ subdirectory.
  • Create an R wrapper function which calls out to your C code and put it in the R/ subdirectory. In this example it is sufficient to have a wrapper function like reverse <- function(x) .Call('reverse', x) but you may want to coerce any variables before passing them to your C function. Note that it is not necessary to load the shared library with dyn.load() in the wrapper. R takes care of this for us.
  • A NAMESPACE file which tells R to load the shared library and what wrapper functions to expose from the R/ subdirectory. In this example useDynLib(reverse) loads the library, and export(reverse) exports the reverse function from the R wrapper code.
  • A DESCRIPTION file which contains a summary of the package. For example the package version is entered here. Please use a sensible scheme for versioning your package (e.g. X.Y, where X is incremented when changes to the public API exposed inside R/ break backward compatibility, and Y is incremented otherwise). You also need to enter the version of R your package depends on; at the moment I am not sure what the best practice is when filling out this field. A safe bet is the version of R that you are using, but this may be too restrictive. Another idea is to pick the lowest version that is used by the people you are sharing your package with (if this is known to you). As for the license field: if you pick a non-standard license then you will get a warning when you check your package (see below), so it may make sense to pick a standard license.

These four items are all that is needed to create a functioning package, but you will get a warning about missing documentation when checking the package (see below). To avoid warnings you will also need

  • A man/ folder with documentation for the package itself and for each function that is exported in the NAMESPACE file. The format used for documentation is described in the guide on extending R. Note: you should include code examples in your help files. These examples are run when you check the files so make sure your examples are complete. If you want to write examples that can’t be run for some reason, you need to wrap them in \dontrun{}.

Building and installing

Creating a package which is ready to be shared consists of the following steps:

  • Building the package. This creates an archive that you can share. Change directory to the parent of the package and type

    R CMD build reverse

    This will generate the archive reverse_1.0.tar.gz, where 1.0 is the version of the package.

  • Checking the package. This ensures that the package can be installed. Type

    R CMD check reverse_1.0.tar.gz

    If there are any problems you will be notified and a log file is created in the directory reverse.Rcheck/. Note that you should pass the name of the archive to the check command, not the name of the directory the package resides in (this can be confusing because the latter works but it creates temporary files inside your package directory structure).

At this point you can go ahead and install the package by typing

R CMD INSTALL reverse_1.0.tar.gz

Now start R, make sure the library loads by typing library(reverse) and that the exported function works by typing reverse(1:10) (you should see the numbers 1 to 10 in reversed order).

To uninstall the package type

R CMD REMOVE reverse

If you only want to distribute the source version of your package then you are done at this point. Simply send the reverse_1.0.tar.gz archive to the people you want to share with and tell them how to install your package. The downside to this is that everybody you share with must have a C compiler installed. If you want to share with people who may not have a C compiler, then you need to create a binary version of your package.

To create a binary version of your package you use the command

R CMD INSTALL --build reverse_1.0.tar.gz

This will first install your package and then create a binary package archive called reverse_1.0.tgz which you can share (the installation procedure for binary packages is the same as for source packages). The problem with this method is that the binary package will only install on computers with the exact same operating system version and R version that you used to build the binary package. To work around this problem you will have to submit your package to CRAN.

Submitting to CRAN

Before submitting to CRAN you should ensure that your package passes all checks. This is not really a problem. What is a problem however is that packages on CRAN should compile on Linux, Mac OS X and Windows and it is up to you to make sure that yours does. Getting your C code to compile on all three platforms can be a big problem. There is a site called win-builder to which you can upload a source package and it will automatically check it on a Windows machine. This is useful if you do not have access to Windows, but it can be very time-consuming to fix compilation problems this way. I do not know of any similar sites for Linux and Mac OS X so if you do not have access to one of these operating systems then you are out of luck.

To actually submit a package you should first read through the submission guidelines (you will be forced to confirm that you’ve read through this later anyway). Next, upload a source package to the CRAN ftp and send an email to the CRAN mailing list (current addresses to the ftp and mailing list can be found at CRAN). In your email the subject should include package name and version (e.g. "reverse 1.0") and in the body simply ask for the package to be added to CRAN. The submission process is not automated; the mailing list is read by the maintainers, so be polite. You will get a reply to your submission. If you need to reply back, make sure you CC the mailing list again as there may be more than one maintainer handling your case. If problems are found in your package and you need to upload a new version, make sure you send a new submission email as well since the maintainers expect a submission email to accompany each archive uploaded to the ftp.

Setting up Xcode 4.3 (for MacVim, Homebrew and Haskell)

Xcode 4.3 was released recently and one of the changes it brought with it was that the /Developer folder now has moved into the Xcode app bundle. This has caused headaches for lots of developers and MacVim was not spared either. I recently did a clean install of Mac OS X Lion and Xcode 4.3 and thought I’d document my experiences in this post.

After installing Xcode 4.3 (my version says 4.3.1) you must manually go into the Xcode preferences, select the Downloads tab and install the Command Line Tools. Even after this step you will not be able to use automake and friends; these have to be installed manually.

My intention was to use Homebrew with my fresh install (I had never bothered with this before as my /usr/local was occupied by my stuff and Homebrew strongly advises against that). However, before Homebrew will work you have to tell Xcode where the Developer dir is, otherwise Xcode still thinks it should be at /Developer (this is after a clean install mind you, so I think this is a bug in Xcode 4.3). Open up Terminal and enter:

$ xcode-select -switch /Applications/Xcode.app/Contents/Developer

Now you can install autotools with Homebrew by typing

$ brew install autoconf automake

With this setup it is now possible to compile MacVim without any problems (actually, you don’t need autotools to build MacVim since it comes with a pre-generated configure script but I need autoconf to generate said script).

Lastly, it turns out Haskell was broken by Xcode 4.3 as well (cabal would complain about gcc not being found). To fix it, open up /usr/bin/ghc with a text editor and look for the line which says pgmgcc="/Developer/usr/bin/gcc" and change this to say

pgmgcc="/usr/bin/gcc"

Using the conceal Vim feature with LaTeX

Vim 7.3 has just been released and with it comes the “conceal” feature (you can download MacVim 7.3 here). One neat application of this feature is that when editing LaTeX files certain backslash commands are replaced by their corresponding Unicode glyph. This is what I am talking about:

You’ll see that Greek letter commands, superscripts/subscripts and mathematics commands are concealed and in their place is rendered the corresponding Unicode glyph. The cursor line however is rendered as is without any concealment so that you can still edit the LaTeX code. (I have on purpose made this line identical to the line above it to let you see what has been concealed.) Inline mathematics (that which goes between two dollar signs) is shown on the last line. Note that the dollar signs are hidden. All in all this makes it a lot more pleasant to skim through a .tex file (but it won’t replace compiling and reading the pdf instead).

To help get you on your way there are a few things you need to know in order to get started with the conceal feature. First of all you need to enable it by typing :set cole=2. You’ll immediately notice lots of grey on grey characters…uh, what? This is the (unfortunate) default syntax coloring for concealed items. To fix it you have to change the Conceal highlight group, e.g. try :hi Conceal guibg=white guifg=black (reverse the colors if you are using a dark color scheme). After fiddling a bit with the colors to match your color scheme you are ready to go!

However, I have found that concealed superscripts and subscripts often do not look very good and fortunately there is a way to disable them. Namely by adding the line let g:tex_conceal="adgm" to your ~/.vimrc file (it also works to put this line in ~/.vim/ftplugin/tex.vim as mentioned below). The g:tex_conceal variable is a string of one-character flags:

a = conceal accents/ligatures
d = conceal delimiters
g = conceal Greek
m = conceal math symbols
s = conceal superscripts/subscripts

Thus "adgm" means conceal everything except superscripts and subscripts. (I did not mention accents/ligatures earlier but "a" does what you’d expect: for example, \"a and \ae will turn into ä and æ respectively, if accents are enabled.)

The conceal support for editing tex files is still in its early stages and you may come across commands that do not get concealed, or perhaps you have some custom LaTeX commands that you would like to add to the list of concealed commands. In either case you should edit the file ~/.vim/after/syntax/tex.vim (create the folders and the file if they don’t exist). Say you would like \eps to render as ε, then add this line:

syn match texGreek '\\eps\>' contained conceal cchar=ε

Mathematics commands should be added to the texMathSymbol group. For example, if you want \arr to be concealed by ←, then add this line:

syn match texMathSymbol '\\arr\>' contained conceal cchar=←

If you find standard LaTeX commands that should be concealed but aren’t, please notify the tex syntax file author so that he may add them (you can find the contact details by looking at the syntax file :tabe $VIMRUNTIME/syntax/tex.vim).

Finally, I personally edit several different types of files and like to keep separate settings for each file type. The simplest way of doing this is to keep your filetype-specific settings inside ~/.vim/ftplugin/filetype.vim. Here’s an excerpt from my ~/.vim/ftplugin/tex.vim file:

" Set colorscheme, enable conceal (except for
" subscripts/superscripts), and match conceal
" highlight to colorscheme
colorscheme topfunky-light
set cole=2
let g:tex_conceal= 'adgm'
hi Conceal guibg=White guifg=DarkMagenta

Some of the relevant help files on this topic are :h 'cole, :h 'cocu, and :h conceal. I should also mention :h 'ambw; it may be helpful to set this to double if you find that some Unicode glyphs “spill over” into the neighboring display cell.

MacVim Services (again)

In a previous post I discussed MacVim Services on Mac OS X 10.5 (Leopard) and earlier. With Mac OS X 10.6 (Snow Leopard) Apple has polished Services to make them more easily accessible, but unfortunately this broke some of the MacVim Services at the same time.  As of Snapshot 52 (released today!) MacVim Services work on Snow Leopard and in this post I’ll quickly demonstrate how they can be put to good use.

MacVim now exposes two Services: “New MacVim Buffer With Selection” and “New MacVim Buffer Here”. Both can be accessed in the usual (pre-10.6) manner via the Services submenu of the current application’s menu, or via a context menu that pops up when you control-click (or right-click) something. The context menu is new in Snow Leopard and makes it so much easier to access Services.

The first Service (New MacVim Buffer With Selection) is available when you control-click the selection in any application (e.g. Safari). When used it will copy the selection, open a new MacVim buffer, and paste the selection into the buffer so you can start editing it.

The second Service (New MacVim Buffer Here) is available when you control-click a file or folder inside a Finder window. When used it will open a new MacVim buffer and set the current directory to that of the file or folder you had selected. This can be handy if you’ve browsed to some folder in the Finder and want to create a new text file inside that folder: simply control-click on any file in the folder, select the Service, add some text, then type :w filename to save the buffer in a file called filename in the folder you had open in the Finder.

Finally, if you don’t want these menu entries clogging up your context menus there is an easy way to disable them: open up System Preferences, click on Keyboard and select the Keyboard Shortcuts tab. In the left-hand list click on Services to bring up a list of available Services in the right-hand view. Search for the Services you don’t want and untick them one at a time and they won’t bother you again.

MacVim on Snow Leopard

As a courtesy to early adopters I am posting a link to a custom binary of MacVim that I currently use on Snow Leopard (+cscope, +perl, +python, +ruby, +tcl, 32 bit Intel, 10.6 only).  When I get time I will make a proper snapshot and post it via the usual channels and remove this binary [edit: it has been removed now].  Note that I cannot provide any support for this binary.  If you do run into problems I would appreciate if you report it on the vim_mac mailing list (not in the comments here).

You can always build your own binary but do note that the icon generation currently is broken.  This can be worked around by commenting out lines 52-57 and 242 in the src/MacVim/icons/docerator.py script. All build issues have been fixed now and the build procedure has been simplified, so go ahead and build your own 64 bit binary — it has never been easier.

To conclude this story: I have now uploaded a new snapshot that will run as 64 bit on Snow Leopard.  Enjoy!

Using gettext in a Cocoa application

Vim uses gettext (from libintl) to support localized messages like (I’m guessing) many *nix programs do. Mac OS X on the other hand uses bundles for localized messages so I had to struggle somewhat to get these two to cooperate when internationalizing MacVim. In this post I’ll describe how gettext and Cocoa decide which language to use for localized messages and how I managed to get both to choose the same language.

The “International” System Preference is used to control which language to use for localized messages. Unfortunately, Cocoa applications use the “Language” tab setting whereas gettext uses the “Formats” tab. For example, if I choose English as my preferred language in the “Language” tab, but have the region in the “Formats” tab set to “Sweden”, then gettext will use Swedish for messages (if available) but Cocoa will use English.

My preference was to get gettext to behave in the same manner as Cocoa instead of the other way around. Fortunately, this is made easy via the LANGUAGE environment variable. This is a colon separated list of preferred languages (e.g. sv:en) and it takes precedence over the LC_* and LANG environment variables when gettext chooses which language to use for localized messages. The list of user-preferred languages (as set in the “Language” tab) can be accessed via +[NSLocale preferredLanguages]. The following Objective-C code sets LANGUAGE to match the user’s choice:

NSArray *languages = [NSLocale preferredLanguages];
if (languages && [languages count] > 0) {
    int i, count = [languages count];
    for (i = 0; i < count; ++i) {
        if ([[languages objectAtIndex:i]
                isEqualToString:@"en"]) {
            count = i+1;
            break;
        }
    }
    NSRange r = { 0, count };
    NSString *s = [[languages subarrayWithRange:r]
            componentsJoinedByString:@":"];
    setenv("LANGUAGE", [s UTF8String], 0);
}

One note about this code: I’ve only included fallback languages up to (and including) “en” (English) since Vim has strings in English in the actual code and hence there is no .mo file for English translations. If I did not disregard all languages after “en” then English would never be used. Also note that the last parameter in the call to setenv() is 0 so that if the user has already set LANGUAGE then we do not override the user’s choice.
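For readers outside Cocoa, the same truncate-at-“en” logic can be sketched in plain C. This is only an illustration under stated assumptions: the caller already has the preferred languages as an array of strings, and build_language_list and export_language are hypothetical helper names, not part of MacVim or gettext. Only setenv() itself is the real POSIX call used in the Objective-C code above.

```c
#define _POSIX_C_SOURCE 200809L  /* for setenv() under strict C modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Join languages[0..n-1] with ':' into buf, stopping right after the
   first "en". Returns 0 on success, -1 if buf is too small.
   (Hypothetical helper for this sketch.) */
int build_language_list(const char *const *languages, size_t n,
                        char *buf, size_t bufsize)
{
    size_t used = 0;

    assert(languages != NULL && buf != NULL);
    if (bufsize == 0)
        return -1;
    buf[0] = '\0';

    for (size_t i = 0; i < n; ++i) {
        size_t len = strlen(languages[i]);
        size_t need = len + (i > 0 ? 1 : 0);   /* ':' before all but the first */
        if (used + need + 1 > bufsize)         /* +1 for the terminating NUL */
            return -1;
        if (i > 0)
            buf[used++] = ':';
        memcpy(buf + used, languages[i], len + 1);
        used += len;
        if (strcmp(languages[i], "en") == 0)
            break;   /* English strings live in the code, so stop here */
    }
    return 0;
}

/* Export the list as LANGUAGE unless the user has already set it. */
int export_language(const char *const *languages, size_t n)
{
    char buf[256];

    if (n == 0 || build_language_list(languages, n, buf, sizeof buf) != 0)
        return -1;
    return setenv("LANGUAGE", buf, 0);   /* 0 = keep an existing value */
}
```

With the preference list sv, en, de this produces sv:en, so German is never consulted after English.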

Inverse functions in Haskell

In this post I will show a simple way of finding the inverse of a function on the real line in Haskell.

Let f be a continuous (real) function defined on a closed interval [a,b]. The inverse of f is well-defined if and only if f is injective (i.e. no value can be assumed at two distinct points, i.e. f(x)=f(y) implies x=y). This is not a property that Haskell can easily check for us, so it is up to us to make sure that f is injective. From now on, we assume that f is injective.

The problem at hand is this: for each y in the range of f find x such that f(x)=y (the range of f in our case is simply the interval [f(a),f(b)], or [f(b),f(a)] if f(b)<f(a)). The function which maps y to x is called the inverse of f. This problem can be solved by introducing a new function F(x)=f(x)-y and then finding the zero of F for each y in the range of f. Note that I say the zero here, since f is assumed to be injective and hence the zero is unique. In general the function F need not be linear, so we’ll need a non-linear equation solver to tackle this problem.

To find zeros of a non-linear equation I’ll use the bisection method. Given f and two points l<r such that f changes sign on the open interval (l,r), the bisection method halves the interval and picks a subinterval where f changes sign and repeats. If the function is zero at the midpoint the method stops (otherwise f must change sign on at least one of the two subintervals since f is continuous and hence the intermediate value theorem applies).

When implementing the bisection method we will eventually run out of precision and the interval can no longer be halved. Our implementation checks for this condition first, and bisects only if it is possible. (For simplicity, we always bisect to maximum precision in this implementation instead of allowing an arbitrary precision to be specified. This is fine as long as we’re using fixed precision arithmetic and the function f does not take too long to evaluate.)

> bisect' f l r
>     | m <= l    = l
>     | m >= r    = r
>     | f l * f m < 0 = bisect' f l m
>     | f m * f r < 0 = bisect' f m r
>     | otherwise = m
>     where m  = (l+r)/2

Note that f(l)*f(m)<0 implies that f changes sign on the open interval (l,m) and f(m)*f(r)<0 implies that f changes sign on (m,r).
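For comparison, here is a rough C rendering of the same idea as bisect', written as a loop instead of recursion. It is a sketch rather than a drop-in replacement: the names real_fn, bisect_to_precision and square_minus_two are made up for this example, and it assumes l < r and that f changes sign on (l,r), just like the Haskell version.

```c
/* Pointer to a real function of one variable (name chosen for this sketch). */
typedef double (*real_fn)(double);

/* Bisect [l, r] until the midpoint can no longer be told apart from an
   endpoint in double precision. Assumes l < r and that f changes sign
   on (l, r). f is re-evaluated at the endpoints on every round to stay
   close to the Haskell code. */
double bisect_to_precision(real_fn f, double l, double r)
{
    for (;;) {
        double m = (l + r) / 2;
        if (m <= l)
            return l;            /* out of precision: cannot halve further */
        if (m >= r)
            return r;

        double fm = f(m);
        if (f(l) * fm < 0)
            r = m;               /* sign change on (l, m) */
        else if (fm * f(r) < 0)
            l = m;               /* sign change on (m, r) */
        else
            return m;            /* f(m) is zero */
    }
}

/* Example function with a zero at sqrt(2). */
double square_minus_two(double x)
{
    return x * x - 2.0;
}
```

Calling bisect_to_precision(square_minus_two, 0.0, 2.0) should return a value that agrees with the square root of 2 to roughly machine precision.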

The function bisect' is guaranteed to find a zero if f changes sign on (l,r). If f does not change sign then it will still return a solution even though there may be none. To fix this problem so that our bisection method only returns a zero if there is one we call this function from a “wrapper” which checks that f changes sign. Also, we make some sanity checks on the endpoints a and b.

> bisect f a b
>     | a > b = bisect f b a
>     | f a * f b < 0 = bisect' f a b
>     | f a == 0 = a
>     | f b == 0 = b
>     | otherwise = error "bisect: failed"

Normally the return value should be wrapped in a Maybe and return Nothing instead of raising an exception in the case where the method fails, but for my purposes I really want the program to halt if there is no solution, hence the use of error.

With bisect in hand we can now easily find the inverse of a continuous and injective function f defined on the closed interval [a,b]:

> inverse f a b y = bisect (\x -> f x - y) a b

Let’s use this to find the inverse of a function f on [0,1] which cannot be inverted “by hand”. Note the f below is injective on this interval, but not on any interval which strictly contains [0,1]. In GHCi:

*Main> let f x = x^2 * abs (tan x)
*Main> let g = inverse f 0 1
*Main> g 1
0.8952060453842319
*Main> f it
1.0
*Main> f (g 0.3)
0.30000000000000004
*Main> it - 0.3
5.551115123125783e-17

That looks good! (Haskell uses double precision floating point, so the last result is zero up to machine epsilon [approx. 10^-16 for double precision].) Note that you have to be careful with the inverse g since it is only defined on the range of f, namely [f(0),f(1)] in this case. At this point it would probably make sense to write some QuickCheck tests, but I’ll stop now anyway.
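The same construction can be sketched in C, with two differences worth noting. C has no closures, so y is passed in explicitly instead of building \x -> f x - y, and failure is reported through a return code (roughly the role Maybe would play) instead of halting. The names real_fn, inverse_at and cube_plus_x are hypothetical, and the example function is a polynomial rather than the x^2 * abs (tan x) used above, purely to keep the sketch free of math library calls.

```c
/* Pointer to a real function of one variable (name chosen for this sketch). */
typedef double (*real_fn)(double);

/* Solve f(x) = y on [a, b] by bisecting F(x) = f(x) - y.
   Assumes f is continuous and injective on [a, b].
   Returns 0 and stores the solution in *x, or returns -1 if F does not
   change sign on the interval (i.e. y is outside the range of f). */
int inverse_at(real_fn f, double a, double b, double y, double *x)
{
    if (a > b) {                 /* accept the endpoints in either order */
        double t = a;
        a = b;
        b = t;
    }

    double fa = f(a) - y;
    double fb = f(b) - y;
    if (fa == 0) { *x = a; return 0; }
    if (fb == 0) { *x = b; return 0; }
    if (fa * fb > 0)
        return -1;               /* no sign change: nothing to find */

    for (;;) {
        double m = (a + b) / 2;
        if (m <= a) { *x = a; return 0; }   /* out of precision */
        if (m >= b) { *x = b; return 0; }

        double fm = f(m) - y;
        if (fm == 0) { *x = m; return 0; }
        if (fa * fm < 0) {
            b = m;               /* zero lies in (a, m) */
        } else {
            a = m;               /* zero lies in (m, b) */
            fa = fm;
        }
    }
}

/* Example: strictly increasing on [0, 1], with range [0, 2]. */
double cube_plus_x(double x)
{
    return x * x * x + x;
}
```

A round-trip check in the spirit of f (g 0.3) above is to call inverse_at(cube_plus_x, 0.0, 1.0, 1.0, &x) and confirm that cube_plus_x(x) is within rounding error of 1.0; asking for y = 3.0, which lies outside the range, should return -1.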

Coercing the Cocoa text system

I just spent the last couple of days refactoring keyboard input handling in MacVim and thought I’d write down some of my findings here for posterity. I’ve read Apple’s documentation on the Cocoa text system over and over again, but it assumes that you are writing a new Cocoa application from scratch and it doesn’t really cover the case when you are porting an app which already has its own way of dealing with keyboard input. If you are in the latter situation, these notes may be of use to you.

My expectations in MacVim are these: I want to enable the use of different input sources (different keyboard layouts as well as more complex input sources such as Kotoeri) but at the same time I need to pass raw key events (which key was pressed and which modifiers were held?) on to Vim since it does its own processing.

Keyboard input arrives in an NSView derived subclass as follows (by “Cmd keys” I mean keys pressed at the same time as Cmd is held and similarly for Ctrl/Alt/Shift):

  1. Cmd and Ctrl keys are sent to performKeyEquivalent: before being passed on to the app menus for key equivalent handling. Note: on Tiger Ctrl keys go directly to keyDown: and (more importantly) if you don’t handle a Cmd key here it will never reach keyDown:. In Leopard Apple fixed this problem: unhandled Cmd key presses (i.e. ones that did not have a menu item bound to them) are sent to keyDown:. The way MacVim handles Cmd key presses on Tiger is to call [[NSApp mainMenu] performKeyEquivalent:event] inside performKeyEquivalent: and if that returns NO send the Cmd key on to Vim and return YES. This way it is possible to bind menu items to key equivalents and let Vim deal with unbound Cmd keys.
  2. Next keyDown: is called. Ideally at this stage I’d like to inspect the NSEvent and pass it on to Vim, but doing so would disable input sources like Kotoeri. To give these a chance to respond to the key event you must call interpretKeyEvents:. When this is done, one of the following will happen:
    • insertText: is called with text to insert (in the form of a NSString or a NSAttributedString)
    • doCommandBySelector: is called with a selector whose name indicates what should happen (e.g. insertNewline:) but the selector name can also be noop: (which is meaningless)
    • the current input source swallows the event and tells us to add marked text (e.g. this happens when you press Alt+i on a US keyboard)
    • the input manager swallows the event and nothing happens (e.g. you press Cmd+Alt and some key which was not bound to a menu item)

If this looks complicated, well, that’s because it is! Let’s discuss in more detail the things that can happen as a result of calling interpretKeyEvents:.

Normal text

“Normally” insertText: is called with the text to insert, e.g. when you hit a key by itself or with Shift held. This is also the case with most Alt keys (on a US keyboard layout) except for some “dead keys” such as Alt+i and Alt+e. Be careful with the parameter to insertText: — it may be either a NSString or a NSAttributedString. In MacVim I explicitly check for the latter case and extract the string from it (since I can’t support the attributes anyway) as follows:

- (void)insertText:(id)string
{
    if ([string isKindOfClass:[NSAttributedString class]])
        string = [string string];

Note that insertText: may be called with more than one character, e.g. when inserting previously marked text. Typically this happens when you use complex input sources such as Kotoeri. Also, insertText: may be called without any keys being pressed at all. For example, the “Special Characters” palette may be used to insert text (this palette can usually be found on the “Edit” menu) so don’t rely on the current event being a key event inside insertText:.

Key bindings

Keys that have a key binding on the other hand will cause doCommandBySelector: to be called, e.g. Ctrl+[ is by default bound to cancelOperation:. In this situation it is a bit tricky to figure out what to do. One possibility is to do nothing and then back in keyDown: look at the NSEvent object to figure out what to do (e.g. by calling characters on the event) and this may work in most cases. However, there are several cases I know of where it does not work so well.

For example, if you press "Alt+i, Return" then the first press (Alt+i) causes marked text to be added, and the second (Return) will first cause the input manager to call insertText: with the marked text (^) and then doCommandBySelector: is called with the selector insertNewline:. If you inspect the [NSEvent characters] in keyDown: for the event generated by the Return press, it will return ^\x0d and not \x0d as you would expect (i.e. the event contains the result of the Alt+i press as well!). In MacVim I check for certain selectors (e.g. insertNewline:) in doCommandBySelector: and pass an appropriate key on to Vim, but most selectors I ignore and deal with it in keyDown: instead.

Swallowed key events

The problem with the input manager swallowing events is hard to deal with. It can be solved by adding appropriate key bindings but currently there is no public API to modify the key bindings at run time.

For example, Cmd+Alt keys will be blocked, but if you add an entry to the key bindings dictionary with the key "@~" and value noop: (@=Cmd, ~=Alt) then the input manager will pass any Cmd+Alt presses on to doCommandBySelector: with selector noop:. One way to deal with this via Private APIs is to set your own key binding dictionary when initializing your app and make sure your own dictionary contains entries such as "@~" to let key presses through. This is the API you can use to do so:

// Private AppKit API (from class-dump)
@interface NSKeyBindingManager : NSObject
+ (id)sharedKeyBindingManager;
- (id)dictionary;
- (void)setDictionary:(id)arg1;
@end

And here is how to set your own dictionary (assuming dict has been populated with your own key bindings already):

NSKeyBindingManager *mgr =
    [NSKeyBindingManager sharedKeyBindingManager];
[mgr setDictionary:dict];

I have yet to find any problems caused by this, but try it at your own risk. One thing to note is that your custom key bindings dictionary should contain most (if not all) of the bindings found in the system default key bindings dictionary

/System/Library/Frameworks/AppKit.framework
            /Resources/StandardKeyBinding.dict

else keyboard navigation may not work in dialogs since they rely on these standard bindings (e.g. forget to bind Return to insertNewline: and you won't be able to use Return to dismiss dialogs).

The unexpected Ctrl+q

Another gotcha I came across in my investigations was that Ctrl+q always seemed to be swallowed by the input manager and then the next keystroke would get mangled in one way or another. The problem turned out to be a User Default called NSQuotedKeystrokeBinding. By default this is set to "^q" (Ctrl+q) and is used to literally enter keys that would otherwise be interpreted by the input manager. Hence, if you ever want to receive any Ctrl+q presses this needs to be disabled, which fortunately is easy using a little bit of Core Foundation:

CFPreferencesSetAppValue(
        CFSTR("NSQuotedKeystrokeBinding"),
        CFSTR(""),
        kCFPreferencesCurrentApplication);

This should be called in the initialization of your application.

Marked text

Apple is a bit vague on how to support marked text handling in custom views. Basically, they say you should implement the NSTextInput protocol but then they don't exactly explain how this is done.

One method that seems to work well is to keep an NSString and an NSRange in your custom view and update them whenever setMarkedText:selectedRange: is called. Occasionally the input manager will ask if there is any marked text and what the selected range is by calling markedText and markedRange. When the user finishes entering marked text (e.g. by hitting Enter in Kotoeri), the input manager calls insertText: to let you know that it is finished with the current marked text entry. As a result you should always clear your local marked text and range variables in insertText:. Note that the input manager almost never explicitly calls unmarkText, which is why you need to do the unmarking manually.
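The bookkeeping described above can be modelled in plain C. This is only a sketch of the state handling, not Cocoa code: the struct MarkedTextState and the functions set_marked_text, unmark_text, has_marked_text and insert_text are hypothetical stand-ins for the view's instance variables and the corresponding NSTextInput methods, and a fixed-size char buffer stands in for the NSString.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the view's state (an NSString and an NSRange
 * in a real Cocoa view). */
typedef struct {
    char   marked[256];   /* current marked text, empty if there is none */
    size_t sel_location;  /* selected range within the marked text */
    size_t sel_length;
} MarkedTextState;

/* Models setMarkedText:selectedRange: by remembering text and range. */
static void set_marked_text(MarkedTextState *s, const char *text,
                            size_t location, size_t length)
{
    assert(s != NULL && text != NULL);
    snprintf(s->marked, sizeof s->marked, "%s", text);
    s->sel_location = location;
    s->sel_length = length;
}

/* Models unmarkText by forgetting the marked text and range. */
static void unmark_text(MarkedTextState *s)
{
    assert(s != NULL);
    s->marked[0] = '\0';
    s->sel_location = 0;
    s->sel_length = 0;
}

/* Answers the input manager's "is there any marked text?" query. */
static int has_marked_text(const MarkedTextState *s)
{
    return s->marked[0] != '\0';
}

/* Models insertText: the marked text entry is finished, so unmark
 * manually (the input manager rarely does it for us) and then append
 * the final text to the document buffer `out` of capacity `outsize`. */
static void insert_text(MarkedTextState *s, char *out, size_t outsize,
                        const char *text)
{
    size_t used;
    assert(out != NULL && text != NULL);
    unmark_text(s);
    used = strlen(out);
    if (used < outsize)
        snprintf(out + used, outsize - used, "%s", text);
}
```

The point of the sketch is the call to unmark_text at the top of insert_text: the state is cleared there unconditionally instead of waiting for an explicit unmark request.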

Command keys

When a Cmd key press reaches keyDown: I would ideally like to know which key was pressed and which modifiers were held, but Cocoa sometimes folds the modifiers into the key and sometimes does not. This turns into a guessing game: whether to extract characters or charactersIgnoringModifiers from the NSEvent, and which of the flags in modifierFlags have already been included in the key.

This is already turning into one massive post so let me just summarize the heuristic I finally settled on:

  • If Shift and/or Alt is held, then use charactersIgnoringModifiers, otherwise use characters
  • The Shift flag is always included in the key so clear it from modifierFlags. The Alt flag should only be cleared if the Ctrl flag is set as well.
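The two rules above can be written down as a small pure function. This is a sketch in plain C, not the actual MacVim code: the MOD_* bit values, the KeyChoice struct and the function choose_key are all hypothetical, and in a real application the two strings and the flags would come from the NSEvent (characters, charactersIgnoringModifiers and modifierFlags).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical modifier bits; the real values come from modifierFlags. */
enum {
    MOD_SHIFT = 1 << 0,
    MOD_CTRL  = 1 << 1,
    MOD_ALT   = 1 << 2,
    MOD_CMD   = 1 << 3
};

/* Result of the heuristic: which string to use as the key and which
 * modifier flags are still to be reported alongside it. */
typedef struct {
    const char *key;
    unsigned    flags;
} KeyChoice;

static KeyChoice choose_key(const char *characters,
                            const char *characters_ignoring_modifiers,
                            unsigned flags)
{
    KeyChoice c;
    assert(characters != NULL && characters_ignoring_modifiers != NULL);

    /* Rule 1: with Shift and/or Alt held, use the "ignoring modifiers"
     * string, otherwise use the plain characters string. */
    c.key = (flags & (MOD_SHIFT | MOD_ALT))
                ? characters_ignoring_modifiers
                : characters;

    /* Rule 2: Shift is always part of the key already, so drop it.
     * Alt is dropped only when Ctrl is held as well. */
    c.flags = flags & ~(unsigned)MOD_SHIFT;
    if (flags & MOD_CTRL)
        c.flags &= ~(unsigned)MOD_ALT;

    return c;
}
```

Keeping the heuristic free of any event object makes it easy to check each modifier combination on its own.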

Note that I could not find any (reasonably easy) way of extracting which unmodified key on the physical keyboard was pressed. Even though the method is called charactersIgnoringModifiers it will sometimes include the modifier flags (e.g. Shift, Alt) and sometimes it won't.

MacVim Services

Many people probably never use the “Services” item that appears on the application menu in every Mac OS X application. At least I never used to, and for two reasons: I didn’t even know it existed for a long time and then when I did learn about it I could never be bothered navigating through all those submenus. I’ve since found a good way to actually use the Services menu, namely via Mac OS X’s ability to customize keyboard shortcuts for any menu item in any application.

Consider the MacVim service “New Document Containing Selection” (phew). With it you can select some text in (almost) any application, run the service and a new MacVim window will open up with the selected text. Using the Services menu to do this would take a lot more time than simply hitting ⌘C, switching to MacVim, then hitting ⌘N and ⌘V, but by assigning your own keyboard shortcut to this service it actually becomes useful.

Here’s how to assign ⌃⌥⌘V to this service (yes, that’s a lot of keys, but at least it’s unlikely to be assigned to some other menu and it is not too awkward to type):

  1. Click the Apple Menu
  2. Choose “System Preferences…”
  3. Click “Keyboard & Mouse”
  4. Click the “Keyboard Shortcuts” tab
  5. Click the “+” button in the bottom left corner
  6. In the sheet that pops down, choose “All Applications” from the drop down menu
  7. In the text box next to “Menu Title” enter “New Document Containing Selection” (double-check for typos or the shortcut won’t work)
  8. Click the edit box next to “Keyboard Shortcut” and hold down Control, Option, Command and v at the same time, then click the “Add” button

Now try it out! For example open Safari, select some text, and hit ⌃⌥⌘V. If it does not work then you have most likely misspelled the menu item in step 7, or you have chosen a shortcut that is already in use (⌃⌥⌘V worked in Safari for me). Finding a shortcut that works across all applications can be kind of frustrating and sometimes you have to restart an application before it will notice a change to the shortcut.

Another tip is to assign a shortcut to the “New Document Here” service for Finder only (choose “Finder” instead of “All Applications” from the drop down menu in step 6 and enter “New Document Here” in step 7). Using that shortcut will open a new MacVim window and the current directory will be set to whatever folder you are currently browsing in the Finder. I often use this to create a new document on the Desktop by clicking the Desktop, hitting the shortcut, entering some text, then hitting ⌘S and entering a file name. Give it a try.

A minimal Vim configuration

I have seen plenty of people describing how they trick out Vim with fancy plugins and (relatively) large rc-files. What I want to do is to point out what I consider a minimal setup that allows me to use Vim without going insane.

MacVim comes with a few default settings that other Vim ports lack. If you are not using MacVim I highly suggest you create a ~/.vimrc file and add these three lines:

set nocp
set bs=indent,eol,start
syntax on

With that out of the way, let’s get started. First of all, you will need to create the file ~/.gvimrc unless you have already done so. To do this, open Vim and enter :e ~/.gvimrc (you may use ~/.vimrc if you prefer, but some of the options below only make sense in the GUI so you may as well stick with ~/.gvimrc unless you have a good reason not to).

These are the options that I consider essential:

Turn off the blinking cursor in normal mode:

set gcr=n:blinkon0

The blinking cursor kind of stresses me out so it is nice to have normal mode as a sort of haven away from insert mode. This may also work as encouragement not to stay in insert mode too much for those who are new to Vim. (Replace with set gcr=a:blinkon0 if you want the blinking to stop altogether!)

Enable search-related options: highlight all matches of a search (hls), show the match for the pattern typed so far as you search (is), and ignore case (ic) unless the search pattern contains uppercase letters (scs). The second line maps Ctrl+n to quickly turn off the highlighting of the current search term (using the command :noh):

set hls is ic scs
nnoremap <silent> <C-n> :noh<CR>

The Ctrl+n mapping only works in normal mode, see :h map-modes.

Enable automatic file type detection:

filetype plugin indent on

Each time a file of a recognized type is opened, this command causes Vim to set several other options that ease editing of that particular type of file.

Use 4 spaces for indentation and replace tabs with spaces in a smart way:

set sw=4 sts=4 et

Choose whichever number of spaces you prefer for indentation by replacing 4 with the number of your choosing.

Yep, that’s really all that I think is necessary, but let me still list a few more settings that almost made it onto that list:

Hide the toolbar:

set go-=T

Maybe one day the toolbar will be useful for something, but right now it isn’t and with it gone you’ll be able to fit at least another line of text on your monitor.

Show the cursor line and column number:

set ruler

The scrollbar is a great visual indication as to where you are in the file but I still find this option very useful.

Increase the default number of lines:

set lines=35

I use this as a sensible default for quick edits and if I start editing in earnest I have a tendency to hit Cmd+Ctrl+z to maximize the window (MacVim only).

Use a dark background: In MacVim the default color scheme’s dark variant is quite nice:

set bg=dark

For other GUIs you’ll need to find a nice dark scheme and use the :colorscheme command (see :h colorscheme). I use dark color schemes since they tend to be much easier on the eyes. (In MacVim black backgrounds can make it hard to distinguish overlapping editor windows (because windows have no borders on Mac OS X) so I often add the line hi Normal guibg=Grey8 to make the background almost-but-not-completely black.)

One final comment: I do not use hidden buffers when editing (which means the undo history of a buffer is lost as soon as it is no longer shown in any window). Instead I open a separate tab for each file that I am currently editing. This works well when editing only a handful of files at a time, but if you need to edit a lot of files at the same time you may want to take a look at the hidden option (see :h 'hid).