LightMotif#

A lightweight platform-accelerated library for biological motif scanning using position weight matrices.

Overview#

Motif scanning with position weight matrices (also known as position-specific scoring matrices) is a robust method for identifying motifs of fixed length inside a biological sequence. These matrices can be used to identify transcription factor binding sites in DNA, or protease cleavage sites in polypeptides. Position weight matrices are often viewed as sequence logos:

https://raw.githubusercontent.com/althonos/lightmotif/main/docs/_static/prodoric_logo_mx000274.svg
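To make the scoring idea concrete, here is a minimal pure-Python sketch of scanning with a position weight matrix. It does not use the lightmotif API: the `log_odds` and `scan` helpers, the example binding sites, the 0.1 pseudocount and the uniform background are all assumptions chosen for illustration.

```python
import math

ALPHABET = "ACGT"


def log_odds(sequences, pseudocount=0.1, background=0.25):
    """Build a log-odds matrix (one dict per motif position) from aligned sites."""
    matrix = []
    for i in range(len(sequences[0])):
        # Count each symbol in column i, starting from the pseudocount.
        counts = {symbol: pseudocount for symbol in ALPHABET}
        for seq in sequences:
            counts[seq[i]] += 1
        total = sum(counts.values())
        # Log-ratio of the observed frequency to the background frequency.
        matrix.append(
            {s: math.log2(counts[s] / total / background) for s in ALPHABET}
        )
    return matrix


def scan(matrix, sequence):
    """Score every window of the sequence that has the length of the motif."""
    m = len(matrix)
    return [
        sum(matrix[j][sequence[i + j]] for j in range(m))
        for i in range(len(sequence) - m + 1)
    ]


# Hypothetical aligned binding sites and target sequence.
sites = ["TTGACA", "TTGACT", "TTTACA"]
pssm = log_odds(sites)
target = "GGCTTGACAGG"
scores = scan(pssm, target)

# Index of the best-scoring window in the target sequence.
best = max(range(len(scores)), key=scores.__getitem__)
```

Each window score is the sum of one matrix entry per motif position, so a naive scan costs one table look-up per position and per window. That inner loop is the part a motif scanning library needs to make fast.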

The lightmotif library provides a Python module to run very efficient searches for a motif encoded in a position weight matrix. The position scanning combines several techniques to allow high-throughput processing of sequences:

Constant alphabets

Compile-time definition of alphabets and matrix dimensions, allowing constant pointer incrementation in loops over strided arrays.

Ordinal encoding

Sequence symbol encoding as indices for fast table look-ups, as implemented in HMMER or MEME.

Striped matrices

Striped sequence matrices for parallel processing, inspired by Michael Farrar (PMID:17110365).

Platform code

Vectorized matrix-row look-up using permute instructions of AVX2.
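The ordinal encoding and striped layout described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the `encode`, `stripe` and `score_striped` helpers, the `ACGTN` alphabet, the 4-column layout and the toy weights are all hypothetical, and an ordinary loop over columns stands in for the AVX2 permute instructions.

```python
ALPHABET = "ACGTN"  # assumed DNA alphabet, with N as the unknown symbol
INDEX = {symbol: i for i, symbol in enumerate(ALPHABET)}
N = INDEX["N"]


def encode(sequence):
    """Ordinal encoding: replace each symbol by its index in the alphabet."""
    return [INDEX[symbol] for symbol in sequence]


def stripe(encoded, columns, wrap):
    """Lay the sequence out column-wise: cell (r, c) holds position c * rows + r.

    `wrap` extra rows repeat the top rows shifted by one column, so that a
    window starting near the bottom of a column can continue into the next.
    """
    rows = -(-len(encoded) // columns)  # ceiling division
    padded = encoded + [N] * (rows * columns - len(encoded))
    matrix = [[padded[c * rows + r] for c in range(columns)] for r in range(rows)]
    for k in range(wrap):
        matrix.append(matrix[k][1:] + [N])
    return rows, matrix


def score_striped(weights, rows, matrix):
    """Score all columns of a row together, one motif position at a time."""
    columns = len(matrix[0])
    scores = [[0.0] * columns for _ in range(rows)]
    for r in range(rows):
        for j, weight_row in enumerate(weights):
            symbols = matrix[r + j]
            # A SIMD implementation would process this whole row in one step.
            for c in range(columns):
                scores[r][c] += weight_row[symbols[c]]
    return scores


# Toy weights for a 3-position motif "ACG": one row per motif position,
# one column per symbol index (A, C, G, T, N).
weights = [
    [1.0, -1.0, -1.0, -1.0, 0.0],
    [-1.0, 1.0, -1.0, -1.0, 0.0],
    [-1.0, -1.0, 1.0, -1.0, 0.0],
]
sequence = "TTACGACGTT"
rows, matrix = stripe(encode(sequence), columns=4, wrap=len(weights) - 1)
striped_scores = score_striped(weights, rows, matrix)

# Back to sequence order: position p lives at row p % rows, column p // rows.
n = len(sequence) - len(weights) + 1
scores = [striped_scores[p % rows][p // rows] for p in range(n)]
```

Because symbols are stored as small integers, each score update is a direct index into a weight row rather than a dictionary or string comparison. Because neighbouring cells of a striped row belong to distant sequence positions, all columns of a row can be updated independently, which is what makes the layout suitable for vector instructions.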

This is the Python version; a Rust crate is available as well.

Setup#

Run pip install lightmotif in a shell to download the latest release and all its dependencies from PyPI, or see the Installation page for other ways to install the lightmotif Python package.

Library#

License#

This library is provided under the open-source MIT license.

This project was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.