The paperless office with Linux

April 10th, 2012 robin Posted in Reviews

A few days ago I took delivery of a used Fujitsu ScanSnap S1500 (currently about 400€ new, I got mine for 235€ on eBay), and started on the long job of making my home office paperless.

xsane

The best news: it works out of the box with Linux (Ubuntu 11.10).  Just install xsane as scanning software and you’re up and running.  xsane is great for custom scans, for example when you want colour, a higher resolution (the scanner does up to 600dpi), lineart or duplex.

scanbuttond

…but for your run-of-the-mill office scanning, you probably just want grey at 150dpi with fairly high compression (comes out at ~150kB per PDF page), and the ability to whack in stacks of paper and just keep on hitting the “Go” button.  For this I installed scanbuttond and sane-utils on my home server (a little box under my table) and put together a little buttonpressed.sh script, so every time the button is pressed it creates a PDF in a network shared folder.  That has the advantage that I can access my scanned documents from any computer in the house, and even scan without turning my desktop on!

#!/bin/bash
# Scan a stack of pages and save them as a single PDF in OUT_DIR.
OUT_DIR=/mnt/raid/scan
TMP_DIR=$(mktemp -d)
# Bail out if the temp dir is unusable, so scans never land in the wrong place
cd "$TMP_DIR" || exit 1
echo "################## Scanning ###################"
scanimage \
 --resolution 150 \
 --batch=scan_%03d.tif --format=tiff \
 --mode Gray \
 --device-name "fujitsu:ScanSnap S1500:7739" \
 -y 297 -x 210 \
 --page-width 210 --page-height 297 \
 --sleeptimer 1
echo "############## Converting to PDF ##############"
# Use tiffcp to combine the output TIFFs into a single multi-page TIFF
# (alternative with LZW compression, disabled)
#tiffcp -c lzw scan_*.tif output.tif
tiffcp scan_*.tif output.tif
# Convert the TIFF to an A4 PDF with JPEG compression at quality 60
tiff2pdf output.tif -j -q 60 -p A4 > "$OUT_DIR/scan_$(date +%Y%m%d-%H%M%S).pdf"
cd ..
echo "################ Cleaning Up ################"
rm -rf "$TMP_DIR"

I took much of the inspiration for the script from this article, which also uses tesseract for OCR.  That just makes a separate text file with the recognised text, though, which I don’t like, so I’m still looking for a way to embed the detected text into the PDF.

As you can see, I had to hard-code the scanner name: scanbuttond (last updated in 2006…) passes the device address, but the current version of scanimage needs the device name as given by scanimage -L, so the two are not really compatible with each other any more… :-/
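
One way around the hard-coded name might be to look it up from scanimage -L when the script starts.  The following is only a sketch: pick_device is a made-up helper name, and it assumes that scanimage -L prints one line per scanner of the form device `NAME' is a …, which may differ between SANE versions.

```shell
#!/bin/bash
# Hypothetical helper (not part of SANE or scanbuttond): read
# `scanimage -L` style output on stdin and print the first device
# name that belongs to the given backend, e.g. "fujitsu".
pick_device() {
    local prefix="$1"
    # Keep only the text between the backtick and the closing quote,
    # then take the first name that starts with "<prefix>:".
    sed -n "s/^device \`\([^']*\)' is .*/\1/p" | grep -m 1 "^${prefix}:"
}

# Usage (needs a connected scanner):
#   DEVICE=$(scanimage -L | pick_device fujitsu)
#   scanimage --device-name "$DEVICE" ...
```

If no matching scanner is found the function prints nothing and returns non-zero, so the script can check for an empty DEVICE before scanning.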

I’ve also set it up so that all PDFs are A4 at only 150dpi, as mentioned earlier, with pretty lossy JPEG compression.  That’s my default preference; YMMV.
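
To get a feel for how lossy that is, here is some rough arithmetic on the page size.  This is a sketch under stated assumptions: mm_to_px is a made-up helper, and the calculation assumes 8 bits per pixel for grey and ignores TIFF/PDF overhead.

```shell
#!/bin/bash
# Hypothetical helper: convert a length in mm to pixels at a given dpi,
# rounded to the nearest pixel (1 inch = 25.4 mm), using integer arithmetic.
mm_to_px() {
    local mm="$1" dpi="$2"
    echo $(( (mm * dpi * 10 + 127) / 254 ))
}

W=$(mm_to_px 210 150)   # A4 width at 150dpi
H=$(mm_to_px 297 150)   # A4 height at 150dpi
# One byte per pixel for 8-bit grey, before any compression
echo "${W}x${H} pixels, $(( W * H )) bytes uncompressed"
```

That works out to 1240x1754 pixels, or about 2.2MB per page uncompressed, so ~150kB per page corresponds to a compression ratio of roughly 14:1.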

The S1500 in detail

Now a little about the scanner itself.  It’s about the size and weight of a compact inkjet printer (or a cat).

The fold-in/out mechanism is pretty easy, so even though I’m ultimately lazy, I might actually flip it shut when I’m done to protect it from dust.  Other than that there’s not much to say… it has one button, and it’s bright blue (I know some people may have a fit at that…).  It comes with a 240VAC to 24VDC adapter, and a USB cable.  The paper feed opens with a little button on the right, and the insides are readily understandable and cleanable.  Did I mention it has two scanning heads, so it does duplex? :)

De-papering my office

My first task with the scanner was to scan in my business receipts from last year – that’s about 300 items, but many of the smaller receipts (bus tickets etc.) are pasted onto A4 sheets (many to a sheet).  It took me about 30 minutes to scan the lot, including the time to remove any staples or clips (a must!), and a few paper jams.  I have no idea how long that would have taken with my old flatbed…

The S1500 isn’t immune to paper jams, but I was surprised to see it handled all the worst sheets (lots of different receipts pasted to one sheet) easily, and only had difficulty with the recycled paper we use, where the individual sheets stick to one another a bit more because of the rougher surface.  With a bit of practice fanning the sheets, this isn’t much of an issue either, but you do have to keep an eye on it as it’s scanning to be sure it picks up each sheet individually.

All in all, I’m very happy with the decision, and am looking forward to shredding and archiving lots of paper out of my office!

 


Moving to Shotwell

February 10th, 2012 robin Posted in Reviews

I just spent a few days getting my old media archives in order.

I’ve previously used F-Spot for managing my photos, and errr… folders for videos. Unfortunately F-Spot development seems to be stagnating, and I’m not too impressed with the performance and frequency of crashes…
Sooo I opened up Shotwell (the (not so) new default photo manager on Ubuntu), and proceeded to import all my photos. I can’t say it went without a hitch – I had to fiddle around with a few batches of photos which had picked up tags I never defined and which Shotwell wouldn’t remove, and some other tags just disappeared altogether… but all in all the effort for migrating ~25k photos was acceptable, and the end result: Shotwell is pretty sweet!
Shotwell is soooo much faster and more responsive than F-Spot, and I love the way that any enhancements you make to a picture (cropping, colour/hue adjustment etc.) are stored as transformations in the database, and only applied to exported pictures – the original image data remains untouched. The F-Spot solution to this was to create Modified versions, but it always bugged me to have so many versions of the same file hanging around, with a progressive loss of quality…
Also I love that tagging in Shotwell is much faster than in F-Spot (for a keyboard junkie like me): [ctrl]-t, and type in (auto-complete) a list of tags.
For video editing (I recently got a camcorder) I tried out Kino (no development in the last year and a half, and working with .MOD files was a pain) and Cinelerra (horrible GUI…) but ended up settling for OpenShot. While it probably can’t be considered a fully fledged video editor, it’s more than enough for my needs, and pretty intuitive.


Perl interface to Google Directions API

February 1st, 2012 robin Posted in google, Ironman, Perl, Travel

Google has a pretty neat service for getting driving/cycling/walking directions between places, and now there’s a Perl interface to it: Google::Directions (and on GitHub)

It’s Moosey and it’s juicy… I hope it helps you get from A to B with Perl a bit faster! :)

If anybody is well versed in Moose::Util::TypeConstraints, I have some issues in this module which I don’t understand, and I would really appreciate some tips on them. :)

And here’s some sample code:

#!/usr/bin/env perl
use strict;
use warnings;
use Google::Directions::Client;
use Getopt::Long;
my %params;
GetOptions( \%params,
 'origin=s',
 'destination=s',
 );
my $goog = Google::Directions::Client->new();
my $response = $goog->directions( %params );
my $first_leg = $response->routes->[0]->legs->[0];
printf( "That journey will take you %0.1f minutes and %0.1fkm\n",
 ( $first_leg->duration / 60 ),
 ( $first_leg->distance / 1000 ),
 );
# $> ./test_directions.pl --origin "Munich, Germany" --destination "Hamburg, Germany"
# That journey will take you 443.8 minutes and 778.5km