Using the Intel 510 Series SSD in a 2011 MacBook Pro at full speed and with TRIM

I just got a new MacBook Pro from my current employer, and since it came without an SSD I bought the Intel 510 250GB and installed it. Everything worked smoothly after the first boot. However, as @denis2342 pointed out, a few extra steps are needed to make it run at full speed.

First of all, although this MacBook Pro has a SATA III interface capable of up to 6 Gigabit/s, System Profiler only showed a »Negotiated Link Speed« of 3 Gigabit. To make it negotiate 6 Gigabit, an SMC reset has to be performed: with the machine shut down, press the (left side) Shift, Control and Option keys together with the power button, release them all, and then boot normally.

After that, System Profiler showed a »Negotiated Link Speed« of 6 Gigabit.
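If you prefer the terminal over clicking through System Profiler, the same value can be read with the system_profiler command line tool:

system_profiler SPSerialATADataType | grep "Negotiated Link Speed"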

Then there is TRIM: although OS X enables TRIM support for Apple’s own SSDs in the latest MacBook Pros, it doesn’t enable it for 3rd party SSDs. There were workarounds that involved patching a system library by hand, but they were messy and not something you’d recommend to a beginner. Luckily there is now a tool called »TRIM Enabler« which lets you back up, restore and patch the library with the click of a button. This worked as expected, and after another reboot System Profiler showed that TRIM was enabled for my 3rd party SSD.
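TRIM support shows up in the same Serial-ATA section of System Profiler, so you can verify it from the terminal as well:

system_profiler SPSerialATADataType | grep "TRIM Support"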

After I ran a system update, TRIM support was disabled again and I had to run TRIM Enabler once more.

I really hope that Apple enables TRIM for all SSDs in Lion to make this step unnecessary.

That’s about it. This SSD is really blazing fast. If you’re interested, there is a nice in-depth review at anandtech.com.

While SSDs from other vendors are still faster, the Intel drives offer higher reliability.

Mac OS X Keyboard Shortcut for locking the screen

For years I’ve been searching for a keyboard shortcut that would instantly lock my screen. I’ve seen this on Linux window managers, and something like it probably exists on every major operating system.

My workaround so far was to open Keychain Access.app, enable the menu bar icon in its preferences and then click the menu bar item -> Lock Screen every time.

I tried AppleScript, I tried Automator, I even convinced a friend that it would be necessary to write a small app for this. Recently, however, somebody told me that such a shortcut exists, and has existed since Mac OS 8.

This holy shortcut is:

⌃ + ⇧ + ⏏ (Control + Shift + Eject)

On Macs without an Eject key you can use the power button in the shortcut instead.

You have to enable »Require Password immediately after sleep or screen saver begins« in System Preferences -> Security to make this truly lock your screen.

There are probably many other shortcuts in Mac OS I don’t know. If you have any hints where to get a complete list, please let me know!

High Sierra Update

With macOS 10.13 there is now a dedicated screen lock feature which can be invoked with this new shortcut:

⌃ + ⌘ + Q (Control + Command + Q)

Other Updates

  • In Mac OS X terminology this shortcut puts the display immediately to sleep.
  • A website listing this and other shortcuts can be found here

Files with mixed and invalid Encodings in Ruby

Recently I encountered a file which mostly contained UTF-8 characters. I could read the file and even throw it at Nokogiri without any problem.

But when I wanted to preprocess the content of the file with gsub, Ruby raised this exception:

invalid byte sequence in UTF-8 (ArgumentError)

Now what to do?

There aren’t many attractive options. I googled around for quite some time, but nobody seemed to have a reasonable solution. So I tried wrapping a begin…rescue around each element I extracted with Nokogiri and attempted to do the gsub cleanup work there; if that failed, I would either try to guess the right encoding (ISO Latin or ASCII) or skip the element entirely. However, even if that had silenced the exceptions, the result would most likely have been incorrect in the end.

My favorite option would be to simply ignore the bad sequences altogether, but I couldn’t find a proper way to do that in Ruby. After some more googling I found this blog post describing how it can be achieved with Iconv.

In case the blog post disappears, here is the executive summary:

require 'iconv'
# »//IGNORE« tells Iconv to silently drop invalid byte sequences
ic = Iconv.new('UTF-8//IGNORE', 'UTF-8')
valid_string = ic.iconv(untrusted_string)
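On newer Rubies this no longer works, because Iconv has since been removed from the standard library. The same cleanup can be done with built-in String methods – a minimal sketch:

# Ruby 2.1 and later: String#scrub drops invalid byte sequences outright
valid_string = untrusted_string.scrub('')

# Ruby 1.9/2.0: transcode through UTF-16 and back, replacing bad bytes
valid_string = untrusted_string.encode('UTF-16', invalid: :replace, replace: '').encode('UTF-8')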

I guess this is useful outside the Ruby universe as well. I hate encoding issues – I really do. Somehow, though, you end up with them in _every_ project.

How to set up SparkleShare beta2 for Mac with your own server

Today the first SparkleShare beta for Mac came out. SparkleShare is quite similar to Dropbox, but instead of being bound to one company, it is based on git and therefore lets you host your repositories on your own servers. I’ve been waiting for this release because one of my colleagues has already been using it on Linux for quite a while.

Unfortunately the documentation is a little thin on self-hosted repositories, so here are the necessary steps to get it working:

  1. Download and install SparkleShare for Mac
  2. Create a bare git repository on your server and make sure you can connect to it using public key authentication (see the sketch after this list)
  3. Start up SparkleShare and add a new folder. In the address field, enter the address of the server you can log in to via public key authentication. In the folder field, enter the absolute path to the repository on the server.
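For step 2, creating the bare repository on the server boils down to a single command (path and name are just placeholders):

# on the server
git init --bare ~/sparkleshare/project.git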

Initially I tried to set it up via the built-in setup assistant, but for some reason an error occurred, even though I was able to log in and clone via the command line. It turned out that I had misinterpreted the address field: I thought the usual ssh://user@server URL would go in there.

You can also add a remote folder via the command line, although that doesn’t seem to be recommended, as mentioned in the comments:

  1. Open up a Terminal and cd to ~/SparkleShare
  2. Then clone your remote repository into that folder: git clone user@server:/path/to/repo.git
  3. The folder should now appear in the SparkleShare menu and work as expected

Also make sure you have a working /usr/bin/git executable, or symlink an existing git binary to that location; otherwise SparkleShare will crash right away.

Parsing large log files with Ruby

Currently I’m writing a couple of munin plugins at work, and for certain components I need to parse log files to extract the numbers munin needs to draw its graphs. My particular task is to parse a Pound log file and generate several custom graphs out of it, so the examples in this post relate to this particular problem. I thought this might be useful anyway, as text / log file parsing is quite a common task.

As you might know, munin asks its nodes for new data every 5 minutes, so the simplest approach would be to iterate over all the lines, trying to find the right ones. The first examples are a bit more general though: instead of the last 5 minutes, you specify a start time and they emit the following 5 minutes of log output.

For these experiments I concatenated log files from the last 5 days, starting with Feb 8th, which resulted in 218 MB or 1,351,440 lines of text. Here is the code for the simple parser:
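A minimal sketch of this first approach (I’m assuming Apache common log style timestamps such as [09/Feb/2011:01:10:00 +0100] here; the original code differed in its details):

# parser_1.rb – naive approach: scan the whole file line by line
require 'time'

INTERVAL = 5 * 60 # munin polls every 5 minutes

file, start = ARGV
start_time = Time.parse(start)
end_time   = start_time + INTERVAL

File.open(file) do |f|
  f.each_line do |line|
    next unless line =~ /\[([^\]]+)\]/ # pull out the timestamp
    time = Time.strptime($1, '%d/%b/%Y:%H:%M:%S %z')
    next  if time <  start_time # before the window: keep scanning
    break if time >= end_time   # past the window: done
    puts line
  end
end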

Here is how it is invoked:

time ruby parser_1.rb big.log "2011-02-09 01:10:00"
real	0m11.620s
user	0m11.551s
sys	0m0.066s

Note that I picked a very early date, and still it takes 11 seconds to emit the right lines. This is far beyond reasonable. Even if I set the starting time to 2011-02-08 20:00:00, it still takes 4.6 seconds at 100% CPU. This is not something you want to run every 5 minutes on your production server.

The next approach is to use a state file. Every day, after logrotate has done its duty, the log file is small and the initial search is fast and cheap. On each run the last useful line number is written to the state file, so that on the next run this many lines can be skipped and only the last 5 minutes of log output have to be searched, instead of the entire log file over and over again.

This is the corresponding code, which looks quite similar to the first approach:
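A sketch of this variant – the state file simply stores the number of lines that are already known to be older than the window (file name made up):

# parser_2.rb – skip known lines via a state file
require 'time'

INTERVAL   = 5 * 60
STATE_FILE = 'parser_state'

file, start = ARGV
start_time = Time.parse(start)
end_time   = start_time + INTERVAL

skip = File.exist?(STATE_FILE) ? File.read(STATE_FILE).to_i : 0
seen = skip

File.open(file) do |f|
  f.each_line do |line|
    next if f.lineno <= skip # fast-forward over lines from previous runs
    next unless line =~ /\[([^\]]+)\]/
    time = Time.strptime($1, '%d/%b/%Y:%H:%M:%S %z')
    if time < start_time
      seen = f.lineno # still older than the window
      next
    end
    break if time >= end_time
    puts line
  end
end

File.write(STATE_FILE, seen.to_s)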

It is invoked just like the first one, and the result for the first run is quite similar:

time ruby parser_2.rb big.log "2011-02-09 01:10:00"
real	0m11.861s
user	0m11.789s
sys	0m0.069s

The second run is a lot faster though, thanks to the line skipping:

time ruby parser_2.rb big.log "2011-02-09 01:10:00"
real	0m0.124s
user	0m0.096s
sys	0m0.025s

This is considerably faster and almost usable in production. Other examples use gets() or readline(), but in my benchmarks I couldn’t measure any performance difference compared to each_line(). Also, skipping lines via the next keyword is fast and nicer to read than lengthy if clauses.

The next thing I tried was to remember the byte offset instead of the number of lines to skip. I already knew about the IO#lineno() method, which returns the current line number of a given file, but only recently I found out about the IO#tell() method, which returns the current byte offset. Maybe seeking byte-wise would increase the speed some more, I thought – but unfortunately it didn’t.
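The byte-offset variant might look like this; the only difference to the sketch above is that the state file now stores the result of IO#tell() and the parser jumps there with IO#seek():

# parser_3.rb – the state file stores a byte offset instead
require 'time'

INTERVAL   = 5 * 60
STATE_FILE = 'parser_state'

file, start = ARGV
start_time = Time.parse(start)
end_time   = start_time + INTERVAL

offset = File.exist?(STATE_FILE) ? File.read(STATE_FILE).to_i : 0

File.open(file) do |f|
  f.seek(offset) # jump over everything inspected on previous runs
  f.each_line do |line|
    next unless line =~ /\[([^\]]+)\]/
    time = Time.strptime($1, '%d/%b/%Y:%H:%M:%S %z')
    if time < start_time
      offset = f.tell # position right after this (too old) line
      next
    end
    break if time >= end_time
    puts line
  end
end

File.write(STATE_FILE, offset.to_s)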

First run:

time ruby parser_3.rb big.log "2011-02-09 01:10:00"
real	0m11.343s
user	0m11.296s
sys	0m0.054s

Second run:

time ruby parser_3.rb big.log "2011-02-09 01:10:00"
real	0m0.113s
user	0m0.101s
sys	0m0.009s

It turns out that seeking and skipping line-wise is almost as fast as byte-wise. The time-consuming part is the creation of Time objects and their comparison. As we all know, object creation is expensive in Ruby, and it is really the bottleneck here. Unfortunately that can’t be optimized away here, as I need to compare times. I had another log file where I could simply compare strings to filter for the proper lines, which was a lot faster, but the Apache common log timestamp doesn’t lend itself to that.

After this insight I tried to optimize the search so that the first run would be a lot faster and wouldn’t take more than a second. Obviously searching line by line wasn’t the fastest way, so I tried reading bigger chunks at once, matching for one of the correct timestamps and then narrowing it down to the first line of the time range. The line count increased for this example, but the important stuff to look at is in the search method.
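Here is a sketch of the idea (simplified: it ignores matches that straddle a chunk boundary and only searches for the first minute of the window):

# parser_4.rb – scan the file in large chunks, then narrow down
require 'time'

INTERVAL  = 5 * 60
CHUNKSIZE = 1024 * 1024 # a rather large buffer performed best

def search(f, needle)
  offset = 0
  while (chunk = f.read(CHUNKSIZE))
    if (pos = chunk.index(needle))
      newline = chunk.rindex("\n", pos) # beginning of the matching line
      return offset + (newline ? newline + 1 : 0)
    end
    offset += chunk.bytesize
  end
  nil
end

file, start = ARGV
start_time = Time.parse(start)
end_time   = start_time + INTERVAL
# the first minute of the window as it appears in the log
needle = start_time.strftime('%d/%b/%Y:%H:%M:')

File.open(file) do |f|
  if (offset = search(f, needle))
    f.seek(offset)
    f.each_line do |line|
      next unless line =~ /\[([^\]]+)\]/
      time = Time.strptime($1, '%d/%b/%Y:%H:%M:%S %z')
      next  if time <  start_time
      break if time >= end_time
      puts line
    end
  end
end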

There is probably lots of room for making this a little shorter and more beautiful, but it works for now and I didn’t want to spend the whole day on this (which I did in the end). Anyway, how well does it perform?

First run:

time ruby parser_4.rb big.log "2011-02-09 01:10:00"
real	0m0.291s
user	0m0.255s
sys	0m0.032s

As you can see, the initial search is more than 35 times faster than the previous attempts. I experimented with the parameters, and choosing a large buffer size increased the performance. Usually you find values like 4096 or 8192 bytes; I haven’t tried many combinations, but somebody suggested taking a rather large value and it worked well enough.

Second run:

time ruby parser_4.rb big.log "2011-02-09 01:10:00"
real	0m0.173s
user	0m0.155s
sys	0m0.016s

Now the second run is slower than before, but I suspect that the search implementation isn’t narrowing down to the exact byte, so the emit method has more lines left to go through. As mentioned before, I’m sure this can be optimized a little further.

The last step for me was to make it work for my munin task. In the end I’m only interested in the last 5 minutes of the log file, no matter how large it is, so searching backwards from the end should boost the overall performance to a desirable degree and render the state file obsolete. Again, this is prototype code and a little too simple when it comes to matching timestamps, but it nonetheless shows how effective this method is. My log file ends at 2011-02-12 09:10:00.
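A sketch of the backwards-reading class (names are made up; the real code handles timestamps more carefully):

# parser_5.rb – read chunks backwards from the end of the file until
# the buffer reaches back past the start of the window, then yield
require 'time'
require 'stringio'

class LastLines
  CHUNKSIZE = 1024 * 1024

  def initialize(file, start)
    @file       = file
    @start_time = Time.parse(start)
    @end_time   = @start_time + 5 * 60
  end

  def each_line
    buffer = ''
    File.open(@file) do |f|
      pos = f.size
      while pos > 0 && !old_enough?(buffer)
        length = [CHUNKSIZE, pos].min
        pos   -= length
        f.seek(pos)
        buffer = f.read(length) + buffer
      end
    end
    StringIO.new(buffer).each_line do |line|
      next unless line =~ /\[([^\]]+)\]/
      time = Time.strptime($1, '%d/%b/%Y:%H:%M:%S %z')
      next  if time <  @start_time
      break if time >= @end_time
      yield line
    end
  end

  private

  # true once the first line in the buffer is older than the window
  def old_enough?(buffer)
    line = buffer.lines.first
    return false unless line && line =~ /\[([^\]]+)\]/
    Time.strptime($1, '%d/%b/%Y:%H:%M:%S %z') < @start_time
  end
end

LastLines.new(ARGV[0], ARGV[1]).each_line { |line| puts line }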

time ruby parser_5.rb big.log "2011-02-12 09:10:00"
real	0m0.043s
user	0m0.026s
sys	0m0.014s

Even without a state file, the search and the output now take only 43 milliseconds. Best of all, this class yields the lines to a block, which is very useful for my munin modules.

The most valuable lessons came straight from reading through the documentation of IO (the same docs with a prettier template). Before this I had no idea about seeking through files in either direction. I learned a couple of new useful methods and found a good use case for StringIO. So I encourage you to do the same! Lastly, it’s also great to know that Ruby is capable enough for tasks like these.

On my way I also found a gem called file-tail which works like the command line tool “tail” but is written entirely in Ruby. Its code is also a good source of inspiration (and probably better factored than my prototyping code).

Please let me know if you have additions, corrections or further optimizations for my examples! I’d really like to know about even better ways to do that.

[UPDATE]

After having another look at the task at hand and considering the hints from the comments, I wrote another version today which is again a lot shorter, requires fewer dependencies and executes very efficiently. This is because I was able to get a more usable timestamp into the log file, so I can now compare strings instead of times. So here is the code:
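The gem’s actual code is a bit more involved, but the core trick can be sketched like this, assuming the new sortable timestamp sits at the beginning of each line:

# with timestamps like "2011-02-12 09:10:00 ..." at the start of each
# line, plain string comparison replaces all the Time object churn
require 'time'

file, start = ARGV
window_end = (Time.parse(start) + 5 * 60).strftime('%Y-%m-%d %H:%M:%S')

File.open(file) do |f|
  f.each_line do |line|
    stamp = line[0, 19] # "YYYY-MM-DD HH:MM:SS" sorts lexicographically
    next  if stamp <  start
    break if stamp >= window_end
    puts line
  end
end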

I made a gem out of this, although it isn’t really that useful for the general public at the moment, as it’s not very flexible yet.

DISCLAIMER: I have an SSD, so accessing and seeking through files of that size may take longer on regular disks, although tests on our server do not suggest a meaningful slowdown. Also, my test log file is way larger than it ever gets in reality, thanks to logrotate. I just wanted to test on a really large file.

[UPDATE]

If you have to parse lots of timestamps, this can slow down your code considerably. There is however a Ruby gem called home_run which claims to be 20-200 times faster than Ruby’s built-in Date and DateTime classes.