Installing RepeatMasker

04 Nov 2016

Today I started working on identifying transposable elements in the Ostrea lurida genome via RepeatMasker.

RepeatMasker is a Linux-based program that identifies transposable elements, satellites, and regions of low DNA complexity in an existing FASTA file. The program can be downloaded from http://www.repeatmasker.org/RMDownload.html

I started by downloading the requisite files and dependencies. For me this was RMBlast, TRF, and the repeat database from GIRI, which requires user authentication provided by the GIRI people (a roughly 3 hour turnaround time for me). dtrx is available from your favorite package manager, e.g. apt-get install dtrx

Installation Walkthrough:

Note: This was done on a MacBook Pro running Ubuntu 16.04, roughly following the repeatmasker.org instructions.

  1. Check the Perl version (perl -v), as RepeatMasker requires version 5.8.0 or greater.

    perl version check

  2. Unzip RepeatMasker into /usr/local.

  3. Unzip blast into /usr/local.

  4. Unzip rmblast into /usr/local.

  5. Copy the rmblast files into the blast directory.

  6. At this point, I switched to dtrx (do the right extraction) instead of tar/gunzip. Just a quality of life thing. I then unzipped the repeat libraries obtained from GIRI. Note: Be better than I am and use the force overwrite (-o) argument for dtrx. RepeatMasker comes with a Libraries directory, so without that argument dtrx will make a second Libraries directory instead of overwriting the existing one. The subsequent pictures show how to fix that.

  7. Updating the Dfam libraries: the picture below shows dtrx both with and without the -o argument. I took the time to delete the extra copy of the Dfam library for cleanliness. Note: I’m not sure this step was necessary for me, as the Dfam libraries only cover human-related transposable elements.

  8. Config time! There are a few little traps, but overall it went smoothly.

  9. First screen, just enter through.

  10. Path to the Perl interpreter; the default worked for me.

  11. RepeatMasker installation location; again, the default worked for me.

  12. Next, it asks for the TRF installation location, and an interesting bug came up. The TRF download is a file named “trf.linux64”, but RepeatMasker expects a file named just “trf” when inspecting the directory given.

  13. Initially, I tried renaming the file, but that was a no-go, so I set up a symlink (symbolic link) to fool RM into thinking TRF existed as it expected.

  14. The next step asks for the search engine location, and whether you’d like it to be the default.

  15. Config done! Err… not so much. The configuration removes some files, which requires sudo as they live in /usr/local. So, just rerun the config using sudo perl ./configure.

  16. Success! And a new error. Apparently we were missing a required Perl module. Easy fix as follows.

  17. Success for real!
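
For anyone hitting the same TRF naming problem, here is a minimal sketch of the symlink workaround from the TRF step above. It runs against a throwaway temp directory with a stub script rather than a real install. The TRF_DIR variable and the stub are assumptions for illustration only; the trf.linux64 and trf file names are the only parts taken from the walkthrough.

```shell
# Hypothetical stand-in for the real TRF install directory; adjust to your setup.
TRF_DIR="$(mktemp -d)"

# Stub standing in for the downloaded binary, which ships as "trf.linux64".
printf '#!/bin/sh\necho "trf stub"\n' > "$TRF_DIR/trf.linux64"
chmod +x "$TRF_DIR/trf.linux64"

# Create a symlink named "trf" next to it, so a lookup for "trf" succeeds.
# -s makes the link symbolic; -f replaces a stale link if one already exists.
ln -sf "$TRF_DIR/trf.linux64" "$TRF_DIR/trf"

# Confirm that the link exists and that running it reaches the stub.
ls -l "$TRF_DIR/trf"
trf_output="$("$TRF_DIR/trf")"
echo "$trf_output"
```

On a real install, TRF_DIR would be the directory that actually holds trf.linux64, and the ln command would likely need sudo if that directory sits under /usr/local. That same directory is then the one to give the configure script when it asks for the TRF location.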