Python: Query reference sequences using rpy2 and BSgenome
Usually this job is done in R, as deescribed in Using R + Bioconductor to Get Flanking Sequence Given Genomic Coordinates. An R sample could be this simple:
Usually this job is done in R, as deescribed in Using R + Bioconductor to Get Flanking Sequence Given Genomic Coordinates. An R sample could be this simple:
之前遇到过这么一个例子:
一个有点鸡贼的方法:
Just take a look at an exemplar output and you’ll know what they are designed for.
需要注意的是:one-hot encoding 应该是 binary encoding 的特殊情况。one-hot 的意思就是:只有一个 bit 是高位(1);类似还有 one-cold encoding
1. Introduction to Information Theory
From stackoverflow: What do the terms “CPU bound” and “I/O bound” mean?:
Greatest thanks to 3Blue1Brown!
GWAVA, Genome Wide Annotation of VAriants, is a tool aiming to predict the functional impact of non-coding genetic variants. It consists of two parts: