Sunday, February 17, 2013

Uploading files via email

The following article describes a method for posting new documents on a website or, more generally, uploading files to a server for further processing. I often create whiteboard pictures and annotated view graphs in class, and need to post them on the class website. I found the CamScanner iPhone app particularly useful for taking pictures of the whiteboard. The app finds the edges of the whiteboard, crops the image, and runs keystone correction and other image-processing algorithms to enhance the picture. The other tool is Notability on the iPad, which I use to annotate my view graphs in class. (I prefer writing on my iPad over using a SmartBoard with its tedious notebook software: the handwriting has to be so big that one can hardly fit anything on the board.)

Most of these iOS apps offer a number of ways to get your documents off the device, including Dropbox, Google Drive, and even built-in HTTP servers. However, I chose email because it also works with our departmental document scanner. Furthermore, my inbox fills up daily with announcements of workshops, internships, and other opportunities that I would like to post on my site. Nothing could be easier than hitting the forward button.

Technically, I could email my documents directly to the server. However, enabling sendmail brings a whole bag of responsibilities with it, and negotiating an open port 25 with the IT authorities doesn't seem worth the trouble. Instead, the method described here uses an external, publicly accessible email server such as Gmail. For my project I set up a dedicated Gmail account, though one could also use one's regular account and fetch emails from a particular folder (or label).

To get started, one needs a Linux box, fetchmail, procmail, and the nmh package. These should be available in every Linux distribution; in many cases they're already installed.

The basic fetchmail configuration is explained in http://www.daemonforums.org/showthread.php?t=5590, and the blog post at http://badcherry.wordpress.com/2006/03/30/fetchmail-without-sendmail/ shows how to get around the sendmail daemon.

Here's the setup for CentOS 5:
  1. Install the packages:
    
    $ yum -y install fetchmail procmail nmh
    
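  2. Create a user account under which the emails will be processed. I wouldn't use my regular user account, although that is possible. In this example, the user account is "adriaan".
  3. Create a .fetchmailrc file in that user's home directory to test the connection to Gmail. Because the file contains a password, fetchmail refuses to run unless only the owner can access it (chmod 600 ~/.fetchmailrc).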
    
    
    poll imap.gmail.com protocol IMAP 
       user "xxxxxxx@gmail.com" is adriaan here
       password 'mysecretpassword'
       fetchlimit 1
       keep
       ssl
    

  4. Test with:
    
    $ fetchmail -v -m '/usr/bin/procmail -d adriaan'
    
  5. When everything works, we change the configuration to:
    
    poll imap.gmail.com protocol IMAP
       user "xxxxxxx@gmail.com" is adriaan here
       password 'mysecretpassword'
       fetchlimit 1000
       ssl
    
    The fetchlimit may prevent disaster if too many emails suddenly arrive at this account. We removed the "keep" option, so from now on mails will be removed from Gmail once they are fetched. By default, fetchmail only retrieves unread messages.
  6. The next step is creating a script for downloading (and processing) the emails. The script, /home/adriaan/bin/getProcessMail, could look something like this:
    
    #!/bin/bash
    #
    # Fetch new mail and hand it to procmail, which delivers it to the local spool
    fetchmail -m '/usr/bin/procmail -d adriaan'
    # Split the spool into one file per message in the MH +inbox folder, then empty the spool
    inc -file /var/spool/mail/adriaan -truncate +inbox
    # this is just collecting ... need to process ...
    
  7. The MH tools are used to separate the email messages into individual files. There are even tools to extract attachments. Having the email messages in separate files makes processing them easier. However, one may consider deleting the files once their content has been processed. Before using MH for the first time (and therefore before the first run of the script), run the command:
    
    $ install-mh
    
  8. We need to run this script every ten minutes. Make it executable (chmod +x), then use the crontab -e command to edit the user's cron table and add the following line:
    
    */10 * * * * /home/adriaan/bin/getProcessMail
    

Now the email messages will be saved automatically on our system, and we're ready to process them. Anybody can send email to the account. If this is not desired, the processing script may first check the sender's address and discard all messages that didn't originate from a list of approved senders. Alternatively, one could achieve the same with Gmail's mail filters.
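The sender check could be sketched roughly as follows. This is only an illustration, not part of the setup above: the function name is_approved_sender and the idea of keeping one approved address per line in a plain text file are my assumptions. It looks only at the first From: line of the header block and does not handle folded headers. Keep in mind that a From: header is easy to forge, so this filters out noise rather than providing real authentication.

```bash
#!/bin/bash
# Hypothetical helper (not from the original setup): succeed only if the
# address in the message's From: header appears in an allowlist file.
# Usage: is_approved_sender MESSAGE_FILE ALLOWLIST_FILE
is_approved_sender() {
    local msg="$1" allowlist="$2" from addr
    # Restrict the search to the header block (everything up to the first
    # blank line), so a "From:" inside the body is ignored.
    from=$(sed -n '1,/^$/p' "$msg" | grep -i -m1 '^From:')
    # Take the first thing on that line that looks like an email address.
    addr=$(printf '%s\n' "$from" \
        | grep -o -E '[[:alnum:]._%+-]+@[[:alnum:].-]+' | head -n 1)
    [ -n "$addr" ] || return 1
    # Case-insensitive, whole-line, fixed-string match against the allowlist.
    grep -q -i -x -F -- "$addr" "$allowlist"
}
```

In the processing script, one would loop over the message files that inc created (with a default install-mh setup these are numbered files under ~/Mail/inbox, which is an assumption about your MH profile) and skip or delete every message for which is_approved_sender fails.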
The nmh package (http://www.nongnu.org/nmh/) has a number of tools to deal with the messages, headers, and attachments.
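As a rough sketch of what the processing step might look like, the fragment below uses nmh's mhstore to write the attachments of all messages in +inbox to a working directory, and then copies only files with an expected extension to the web directory. The function names, the directories, and the extension list are hypothetical. I am also assuming that mhstore writes to the current directory (it does so unless an nmh-storage profile entry says otherwise) and that -auto makes it use the file names given in the messages.

```bash
#!/bin/bash
# Sketch only: function names, directories, and extensions are assumptions.

# Store the attachments of every message in +inbox in directory $1.
# mhstore writes into the current directory by default, so cd there first.
extract_attachments() {
    local workdir="$1"
    mkdir -p "$workdir" && ( cd "$workdir" && mhstore +inbox all -auto )
}

# Copy files with an approved extension from directory $1 to directory $2
# and print the name of each file that was published.
publish_documents() {
    local src="$1" dest="$2" f
    mkdir -p "$dest" || return 1
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        case "${f##*/}" in
            *.pdf|*.PDF|*.jpg|*.JPG|*.png|*.PNG)
                cp -- "$f" "$dest"/ && printf '%s\n' "${f##*/}" ;;
        esac
    done
}
```

A processing script could call extract_attachments with a scratch directory and then publish_documents with that directory and the document root of the class site. Restricting the copy to a few extensions keeps message bodies and unexpected file types off the website.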
