LightMotif ¶
A lightweight platform-accelerated library for biological motif scanning using position weight matrices.
Overview¶
Motif scanning with position weight matrices (also known as position-specific scoring matrices) is a robust method for identifying motifs of fixed length inside a biological sequence. It can be used to identify transcription factor binding sites in DNA or protease cleavage sites in polypeptides. Position weight matrices are often visualized as sequence logos.
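To make the idea of scanning concrete, here is a minimal pure-Python sketch of how a log-odds matrix could be built from a few aligned motif instances and slid along a target sequence. It does not use the lightmotif API: the helper names `build_pssm` and `scan`, the pseudocount of 0.1, the uniform background of 0.25 and the example sequences are all assumptions chosen for illustration.

```python
import math

ALPHABET = "ACGT"

def build_pssm(sequences, pseudocount=0.1, background=0.25):
    """Build a log-odds matrix (one row per motif position) from aligned sequences."""
    length = len(sequences[0])
    pssm = []
    for i in range(length):
        # Count each symbol in column i, starting from the pseudocount.
        counts = [pseudocount] * len(ALPHABET)
        for seq in sequences:
            counts[ALPHABET.index(seq[i])] += 1
        total = sum(counts)
        # Log-odds of the observed frequency against the background frequency.
        pssm.append([math.log2(c / total / background) for c in counts])
    return pssm

def scan(pssm, sequence):
    """Return the score of every window of the sequence against the matrix."""
    encoded = [ALPHABET.index(c) for c in sequence]
    m = len(pssm)
    return [
        sum(pssm[j][encoded[i + j]] for j in range(m))
        for i in range(len(encoded) - m + 1)
    ]

# Hypothetical motif instances and a target that contains the first one at offset 4.
pssm = build_pssm(["GTTGACCTTATCAAC", "GTTGATCCAGTCAAC"])
target = "AAAA" + "GTTGACCTTATCAAC" + "CCCC"
scores = scan(pssm, target)
best = max(range(len(scores)), key=scores.__getitem__)
print(best, scores[best])
```

The highest-scoring window is the one where the motif instance was planted. This naive loop scores one position at a time, which is the baseline that a vectorized scanner improves on.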
The lightmotif library provides a Python module to run very efficient searches for a motif encoded in a position weight matrix. The position scanning combines several techniques to allow high-throughput processing of sequences:
Compile-time definition of alphabets and matrix dimensions.
Sequence symbol encoding for fast table look-ups, as implemented in HMMER or MEME.
Striped sequence matrices to process several positions in parallel, inspired by Michael Farrar.
Vectorized matrix row look-up using permute instructions of AVX2.
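The symbol encoding and striped layout listed above can be illustrated with a small pure-Python sketch. This is one possible reading of a Farrar-style striped layout, not lightmotif's actual implementation or API: the names `stripe` and `score_striped`, the padding code `PAD`, the wrap rows and the toy 0/1 weights are all assumptions made for this example. The innermost loop over lanes stands in for what a single SIMD instruction would do at once.

```python
ALPHABET = "ACGT"
PAD = len(ALPHABET)  # extra symbol code used to fill unused cells

def stripe(sequence, lanes, wrap):
    """Encode a sequence and store it column-major in a matrix with `lanes` columns.

    Position i goes to row i % rows, column i // rows, so every row holds
    `lanes` positions that can be scored independently of each other.
    `wrap` extra rows (at most `rows`) repeat the first rows shifted by one
    column, so a window starting near the bottom of a column can continue
    into the next column.
    """
    encoded = [ALPHABET.index(c) for c in sequence]
    rows = -(-len(encoded) // lanes)  # ceiling division
    matrix = [[PAD] * lanes for _ in range(rows + wrap)]
    for i, code in enumerate(encoded):
        matrix[i % rows][i // rows] = code
    for t in range(wrap):
        matrix[rows + t][:-1] = matrix[t][1:]
    return matrix, rows

def score_striped(weights, matrix, rows):
    """Score every position; each weight row is looked up for a whole matrix row."""
    lanes = len(matrix[0])
    scores = [[0.0] * lanes for _ in range(rows)]
    for r in range(rows):
        for j, row_weights in enumerate(weights):
            codes = matrix[r + j]
            for k in range(lanes):  # one vector operation in a SIMD version
                scores[r][k] += row_weights[codes[k]]
    return scores

# Toy weights for the motif "ACG": 1.0 for a match, 0.0 otherwise (and for PAD).
motif = "ACG"
weights = [[1.0 if s == m else 0.0 for s in ALPHABET] + [0.0] for m in motif]

sequence = "TTACGTACGGAC"
matrix, rows = stripe(sequence, lanes=4, wrap=len(motif) - 1)
scores = score_striped(weights, matrix, rows)

# Read the scores back in sequence order, keeping only complete windows.
flat = [scores[p % rows][p // rows] for p in range(len(sequence) - len(motif) + 1)]
print(flat)
```

Reading the striped scores back in sequence order gives the same values as a position-by-position scan, with the two occurrences of the toy motif scoring highest.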
This is the Python version; a Rust crate is available as well.
Setup¶
Run pip install lightmotif in a shell to download the latest release and all its dependencies from PyPI, or have a look at the Installation page to find other ways to install the lightmotif Python package.
Library¶
License¶
This library is provided under the open-source MIT license.
This project was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.