10/18/03

Bio::SearchIO::Blink

This module retrieves a BLINK result from NCBI.

This is something I needed for my own work. The result is decidedly un-bioperl-like, but it works.

Bio::Tools::RepeatFinder

This module finds repeats in sequences. It will find direct and indirect repeats, both perfect and imperfect.

The module is not necessarily optimized for speed! The imperfect-repeat part, especially, is really slow because it looks through all the perfect repeats and joins nearby ones together. However, it works, and I have used it on sequences longer than 100 kb and for repeats as short as 7 bp.

The "imperfect repeat" part of it just joins other repeats, so it will find AAAAAAAGAAAAAAA and AAAAAAACAAAAAAA, but not AAAAAAAGAA and AAAAAAACAA.
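The joining behaviour can be illustrated with a minimal sketch. It is written in Python rather than Perl purely for illustration and is not the module's code: the function names `perfect_repeats` and `join_nearby`, the `max_gap` parameter, and the test sequences are all assumptions made up for this example. It handles direct repeats only and assumes that "nearby" means two perfect repeats with the same spacing between their copies, separated by a small gap.

```python
from collections import defaultdict


def perfect_repeats(seq, min_len):
    """Return maximal direct perfect repeats as sorted (start_a, start_b, length)."""
    # Index every window of min_len bases by its start positions.
    positions = defaultdict(list)
    for i in range(len(seq) - min_len + 1):
        positions[seq[i:i + min_len]].append(i)

    found = []
    for starts in positions.values():
        for x in range(len(starts)):
            for y in range(x + 1, len(starts)):
                a, b = starts[x], starts[y]
                # Skip seeds that could be extended to the left;
                # the longer match that contains them is reported instead.
                if a > 0 and seq[a - 1] == seq[b - 1]:
                    continue
                # Extend the match to the right as far as it stays perfect.
                length = min_len
                while b + length < len(seq) and seq[a + length] == seq[b + length]:
                    length += 1
                found.append((a, b, length))
    return sorted(found)


def join_nearby(repeats, max_gap):
    """Merge perfect repeats with the same copy spacing that lie within max_gap."""
    by_offset = defaultdict(list)
    for a, b, length in repeats:
        by_offset[b - a].append((a, b, length))

    joined = []
    for group in by_offset.values():
        group.sort()
        cur_a, cur_b, cur_len = group[0]
        for a, b, length in group[1:]:
            if a - (cur_a + cur_len) <= max_gap:
                # Close enough: absorb the gap and the next repeat.
                cur_len = max(cur_len, a + length - cur_a)
            else:
                joined.append((cur_a, cur_b, cur_len))
                cur_a, cur_b, cur_len = a, b, length
        joined.append((cur_a, cur_b, cur_len))
    return sorted(joined)


# Hypothetical test sequences: two copies of a unit that differ at one base.
SPACER = "TTGACCGT"
LONG_FLANKS = "ACGTTGCA" + "G" + "TCAGGCTA" + SPACER + "ACGTTGCA" + "C" + "TCAGGCTA"
SHORT_FLANK = "ACGTTGCA" + "G" + "TCA" + SPACER + "ACGTTGCA" + "C" + "TCA"

print(join_nearby(perfect_repeats(LONG_FLANKS, 7), max_gap=1))
print(join_nearby(perfect_repeats(SHORT_FLANK, 7), max_gap=1))
```

With a minimum repeat length of 7, the first sequence has a perfect repeat on each side of the mismatch, so the two are joined into a single 17-base imperfect repeat. In the second sequence the flank after the mismatch is only 3 bases long, so there is nothing to join and only the 8-base perfect repeat is reported.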

The best way to use it is to pass in a GenBank sequence, and you'll get a GenBank sequence back with the repeats annotated.

Note: at the moment this module requires Tie::HashRef so that you can have references to hashes as keys and values of hashes. This is really only needed (in the current implementation) for the imperfect repeats.


Note: The modules below have been rolled into bioperl 1.3. Don't use these.

6/21/03

All the modules have been combined into a single tarball.

The modules are:

There are also test scripts and a couple of example scripts.


Old versions

OK, so there is not much here. First, here are the primer modules that I have been working on.


Next, there is version 0.3 of Bio::Tools::Analysis::Nucleotide::RestrictionEnzyme. This version contains several updates from the previous version (0.1):

There is an initial version of the Bio::Tools::Analysis::Nucleotide::RestrictionEnzyme suite that I have been messing with. This is based on some bugzilla comments from Heikki Lehvaslaiho and Steve Chervitz. It sort of came together out of some other things I was messing with (REBASE), so I am not sure how good it is. There are three components:

  1. Bio::Tools::Analysis::Nucleotide::RestrictionEnzyme just describes a single enzyme
  2. Bio::Tools::Analysis::Nucleotide::RestrictionEnzymeAnalysis is for cutting sequences
  3. Bio::Tools::Analysis::Nucleotide::RestrictionEnzymeCollection is for adding restriction enzymes, especially from REBASE. At the moment it deals with only one type of file, the "All enzymes (individually referenced) with isoschizomers" file. This one has all the data, so the others should be easy, right?
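
The split into an enzyme description, a collection, and an analysis step can be sketched roughly as follows. This is a Python illustration only, not the Perl API of these modules: the `Enzyme` class, the `COLLECTION` dictionary, and the `cut` function are names invented for this example. It assumes a linear sequence, looks at one strand only, and does not handle ambiguity codes. The one real fact used is that EcoRI recognises GAATTC and cuts after the G.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enzyme:
    """Describes a single enzyme (hypothetical stand-in for component 1)."""
    name: str
    site: str        # recognition sequence
    cut_offset: int  # bases from the start of the site to the cut


# A tiny stand-in for component 3; a real collection would be loaded from REBASE.
COLLECTION = {"EcoRI": Enzyme("EcoRI", "GAATTC", 1)}


def cut(seq, enzyme):
    """Cut a linear sequence at every site (stand-in for component 2)."""
    fragments = []
    start = 0
    pos = seq.find(enzyme.site)
    while pos != -1:
        cut_at = pos + enzyme.cut_offset
        fragments.append(seq[start:cut_at])
        start = cut_at
        # Resume one base on so that overlapping sites are not missed.
        pos = seq.find(enzyme.site, pos + 1)
    fragments.append(seq[start:])
    return fragments


print(cut("AAGAATTCTTGAATTCGG", COLLECTION["EcoRI"]))
```

Keeping the enzyme description separate from the cutting code means the same analysis step can be run with any enzyme pulled from the collection.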

The tarball above contains 4 example scripts that take a sequence, generate some data, and spit out some results.

Large parts of these modules are just rearranged from Steve's early work, and he deserves and gets all the credit for that. I added a few things to make Heikki happy!

As per Heikki's suggestion, I called this suite Bio::Tools::Analysis::Nucleotide::RestrictionEnzyme, which is damn long to type all the time, but it should therefore still play with Bio::Tools::RestrictionEnzyme.


I have put v0.3 of the hacker's guide to bioperl online. It is not really a guide, more a useful list.


There are some scripts that I have written here.