Wednesday, 13 November 2013

No FileSystem for scheme: hdfs

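I recently ran into the "No FileSystem for scheme: hdfs" error in a Scala application that I wrote to work with some HDFS files. I was bundling the Cloudera Hadoop 2.0.0-cdh4.4.0 libraries with my application, and it turned out that the FileSystem service definition for HDFS was missing from the META-INF/services directory. Hadoop 2 discovers FileSystem implementations through Java's ServiceLoader, which reads that file, so without an entry for DistributedFileSystem the hdfs scheme cannot be resolved. The fix is as follows: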

  1. Create a META-INF/services directory in src/main/resources and add a file named org.apache.hadoop.fs.FileSystem:
    mkdir -p src/main/resources/META-INF/services
    touch src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem
    
  2. Edit the org.apache.hadoop.fs.FileSystem file and add the following content. Only the last line is needed for HDFS; the other entries can be omitted if you do not use those file systems:
    org.apache.hadoop.fs.LocalFileSystem
    org.apache.hadoop.fs.viewfs.ViewFileSystem
    org.apache.hadoop.fs.s3.S3FileSystem
    org.apache.hadoop.fs.s3native.NativeS3FileSystem
    org.apache.hadoop.fs.kfs.KosmosFileSystem
    org.apache.hadoop.fs.ftp.FTPFileSystem
    org.apache.hadoop.fs.HarFileSystem
    org.apache.hadoop.hdfs.DistributedFileSystem
    
  3. If you are using the Maven assembly plugin to create a fat jar, add the following stanza to assembly.xml inside the <assembly> element so that META-INF/services files from different dependencies are merged:
        <containerDescriptorHandlers>
            <containerDescriptorHandler>
                <handlerName>metaInf-services</handlerName>
            </containerDescriptorHandler>
        </containerDescriptorHandlers>
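
To confirm that the service definition actually ended up in the assembled jar, a quick check along these lines may help. This is only a sketch: the helper name lists_hdfs_filesystem and the jar path in the usage comment are placeholders rather than anything from my build, and it assumes unzip is installed.

```shell
# Succeeds if the service file content read from stdin registers the
# HDFS implementation. (Hypothetical helper name.)
lists_hdfs_filesystem() {
    grep -qFx 'org.apache.hadoop.hdfs.DistributedFileSystem'
}

# Example usage (the jar path is a placeholder for your own artifact):
#   unzip -p target/myapp-jar-with-dependencies.jar \
#       META-INF/services/org.apache.hadoop.fs.FileSystem \
#       | lists_hdfs_filesystem && echo "hdfs scheme registered"
```

The helper reads from stdin so that the same check works on the file in src/main/resources as well as on the copy inside the jar.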
    

Saturday, 9 November 2013

Reducing the number of threads created by the ElasticSearch Transport Client

This is a quick snippet saved for posterity so that I won't have to pull my hair out the next time I run into this issue. To reduce the number of threads spawned by the ElasticSearch transport client (default is 2 x num_cores) add the following option to the settings object:
    Settings settings = ImmutableSettings.settingsBuilder().put("transport.netty.workerCount", NUM_THREADS).build();

Saturday, 8 June 2013

Setting up dnscrypt on Fedora

DNSCrypt is a free service by OpenDNS that provides encrypted DNS lookups. If you are concerned about man-in-the-middle attacks, data collection/spying by various entities or ad injections by unscrupulous ISPs, encrypting your DNS lookups is a good starting point. Bear in mind that just encrypting your DNS lookups will not make you secure online. It has to be used in conjunction with a lot of other tools and services if you really want to safeguard your privacy.

  1. Download the DNSCrypt tarball from http://download.dnscrypt.org/dnscrypt-proxy/ (the latest version at the time of writing was dnscrypt-proxy-1.3.0.tar.gz).
  2. Extract the tarball, then build and install:
    tar xvf dnscrypt-proxy-1.3.0.tar.gz && cd dnscrypt-proxy-1.3.0
    ./configure
    make -j4
    sudo make install
    
  3. Create a new system user to run the service:
    sudo adduser -m -N -r -s /bin/false dnscrypt
  4. Now start the service in the foreground to make sure everything is working:
    sudo dnscrypt-proxy -u dnscrypt
  5. Change your system DNS server to 127.0.0.1. There are many ways to do this. The adventurous can edit the appropriate script in /etc/sysconfig/network-scripts/. If you don't have NetworkManager installed, editing /etc/resolv.conf would work too. Gnome users: click on the network icon, click 'Network Settings', select the connection and click 'Options'. Then in the 'IPv4 Settings' tab, set the 'Method' to 'Automatic (DHCP) Addresses Only' and type in 127.0.0.1 in the 'DNS Servers' text box.
  6. Restart the network service for the DNS server changes to take effect:
    sudo systemctl restart network.service
  7. Now you can verify that the changes have taken effect by running dig google.com and checking the output for the line: SERVER: 127.0.0.1#53(127.0.0.1). Alternatively, navigate to http://www.opendns.com/welcome/ using a web browser. The screen will tell you whether you are using OpenDNS or not.
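
The dig check in step 7 can be scripted roughly as follows. This is a sketch, and the helper name answered_by_local_proxy is made up for illustration; it simply looks for the SERVER line quoted above in whatever dig prints.

```shell
# Succeeds if dig output read from stdin reports 127.0.0.1 port 53 as the
# server that answered the query. (Hypothetical helper name.)
answered_by_local_proxy() {
    grep -qF 'SERVER: 127.0.0.1#53('
}

# Example usage:
#   dig google.com | answered_by_local_proxy && echo "lookups go through the local proxy"
```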
To run the dnscrypt-proxy service on system startup, create a systemd service as follows:
  1. As root, create the file /etc/systemd/system/dnscrypt.service with the following content:
  2. Reload systemd so that it picks up the new unit, and enable the service so that it starts at boot:
    sudo systemctl daemon-reload
    sudo systemctl enable dnscrypt.service
  3. The dnscrypt service will now start automatically on every boot. You can also start or stop it manually with the usual systemctl commands:
    sudo systemctl start dnscrypt.service
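
The content of the unit file for step 1 is not shown above. As a rough sketch of what such a unit could look like (not necessarily the original file), the following writes a minimal one to the current directory for review. It assumes that make install placed the binary at /usr/local/sbin/dnscrypt-proxy and reuses the -u dnscrypt option from the foreground test; the Description, After and Restart values are my own guesses at sensible defaults.

```shell
# Write a minimal unit file to the current directory so it can be reviewed
# before it is installed.
cat > dnscrypt.service <<'EOF'
[Unit]
Description=DNSCrypt proxy
After=network.target

[Service]
# Assumed install path; check yours with: command -v dnscrypt-proxy
ExecStart=/usr/local/sbin/dnscrypt-proxy -u dnscrypt
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

# Then copy it into place as root:
#   sudo install -m 644 dnscrypt.service /etc/systemd/system/dnscrypt.service
```

The [Install] section is what allows systemctl enable to hook the service into the normal multi-user boot target.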

Saturday, 9 February 2013

Mounting a Nexus 7 on Fedora

If you need to transfer several gigabytes of data between your computer and the Nexus 7 tablet, the fastest option is to mount the N7 as an MTP file system. One way to achieve this is with the mtpfs tool, which can be found in the Fedora repositories. However, I found it to be buggy and slow. The better option, as mentioned in Linux Format 165, is to use jmtpfs. The installation instructions in the source were a bit out of date, so here's how to compile jmtpfs:

sudo yum install libmtp-devel fuse-devel file-devel
git clone https://github.com/kiorky/jmtpfs.git && cd jmtpfs
./configure && make
sudo make install

Once installed, mounting the N7 is a breeze. Use the USB cable to connect the tablet to the computer and then create a directory where you want to mount it.
mkdir Nexus
jmtpfs Nexus

To unmount, use the command:
fusermount -u Nexus