LightMotif#
A lightweight platform-accelerated library for biological motif scanning using position weight matrices.
Overview#
Motif scanning with position weight matrices (also known as position-specific scoring matrices) is a robust method for identifying motifs of fixed length inside a biological sequence. These matrices can be used to identify transcription factor binding sites in DNA, or protease cleavage sites in polypeptides. Position weight matrices are often visualized as sequence logos.
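To make the scoring concrete, here is a minimal pure-Python sketch of a motif scan. It is an illustration only and is neither the lightmotif API nor its implementation. The helper names (build_pssm, scan), the pseudocount of 0.1 and the uniform background of 0.25 are assumptions chosen for this example. Each window of the sequence is scored by summing one log-odds value per motif column.

```python
import math

ALPHABET = "ACGT"

def build_pssm(sites, pseudocount=0.1, background=0.25):
    """Build a log-odds matrix (one dict per motif column) from aligned sites."""
    pssm = []
    for j in range(len(sites[0])):
        # Count each symbol in column j, starting from the pseudocount.
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[j]] += 1
        total = sum(counts.values())
        # Log-odds of the observed frequency against the background frequency.
        pssm.append(
            {base: math.log2(counts[base] / total / background) for base in ALPHABET}
        )
    return pssm

def scan(pssm, sequence):
    """Score every window of the motif length along the sequence."""
    m = len(pssm)
    return [
        sum(pssm[j][sequence[i + j]] for j in range(m))
        for i in range(len(sequence) - m + 1)
    ]

# Hypothetical example data: two aligned binding sites and a target sequence
# that contains the first site after a 20-symbol prefix.
sites = ["GTTGACCTTATCAAC", "GTTGATCCAGTCAAC"]
pssm = build_pssm(sites)
sequence = "ATGTCCCAACAACGATACCC" + sites[0] + "GAGCCCATCG"
scores = scan(pssm, sequence)
best = max(range(len(scores)), key=scores.__getitem__)
print(best, sequence[best:best + len(pssm)])
```

This naive scan does one table look-up per motif column for every window, which is the cost that a vectorized implementation aims to reduce.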
The lightmotif library provides a Python module to run very efficient searches for a motif encoded in a position weight matrix. The scanning implementation combines several techniques to allow high-throughput processing of sequences:
Compile-time definition of alphabets and matrix dimensions, allowing constant pointer incrementation in loops over strided arrays.
Striped sequence matrices for parallel processing, inspired by Michael Farrar (PMID:17110365).
Vectorized matrix-row look-up using the permute instructions of AVX2.
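The striped layout mentioned in the list above can be sketched in pure Python. This is a simplified illustration under stated assumptions, not the library's code. The names stripe and scan_striped, the toy integer matrix, the padding symbol N and the choice of 4 lanes are all hypothetical. The sequence is laid out column-major, so each row holds one position from every lane, and a single row-wise update advances all lanes at once. Extra "wrap" rows repeat the start of the next lane so that windows can run past the bottom of a column.

```python
import math

# Toy integer scoring matrix for the motif "ACG"; "N" is the padding symbol.
PSSM = [
    {"A": 2, "C": -1, "G": -1, "T": -1, "N": -4},
    {"A": -1, "C": 2, "G": -1, "T": -1, "N": -4},
    {"A": -1, "C": -1, "G": 2, "T": -1, "N": -4},
]

def stripe(sequence, columns, wrap, pad="N"):
    """Lay the sequence out column-major in `columns` lanes, plus `wrap` rows."""
    rows = math.ceil(len(sequence) / columns)
    padded = sequence.ljust(rows * columns + wrap, pad)
    # Row r holds positions r, r + rows, r + 2 * rows, ... (one per lane).
    matrix = [
        [padded[c * rows + r] for c in range(columns)]
        for r in range(rows + wrap)
    ]
    return matrix, rows

def scan_striped(pssm, matrix, rows, columns):
    """Score all windows; scores[r][c] belongs to position c * rows + r."""
    scores = [[0] * columns for _ in range(rows)]
    for r in range(rows):
        for j, column_scores in enumerate(pssm):
            symbols = matrix[r + j]
            # In a SIMD implementation this lane loop is one vector operation.
            for c in range(columns):
                scores[r][c] += column_scores[symbols[c]]
    return scores

def scan_naive(pssm, sequence):
    """Reference scan over the linear sequence, used to check the result."""
    m = len(pssm)
    return [
        sum(pssm[j][sequence[i + j]] for j in range(m))
        for i in range(len(sequence) - m + 1)
    ]

sequence = "TTACGTTTACGATTACGGA"
columns = 4
matrix, rows = stripe(sequence, columns, wrap=len(PSSM) - 1)
striped_scores = scan_striped(PSSM, matrix, rows, columns)

# Read the scores back in sequence order and compare with the naive scan.
flat = [
    striped_scores[p % rows][p // rows]
    for p in range(len(sequence) - len(PSSM) + 1)
]
assert flat == scan_naive(PSSM, sequence)
```

Because consecutive positions of a window sit in consecutive rows of the same lane, the inner loop never needs to shuffle data between lanes, which is what makes the layout suitable for vector instructions.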
This is the Python version; a Rust crate is available as well.
Setup#
Run pip install lightmotif in a shell to download the latest release and all its dependencies from PyPI, or have a look at the Installation page to find other ways to install the lightmotif Python package.
Library#
License#
This library is provided under the open-source MIT license.
This project was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.