It seems that most people assume Ensembl’s GTF file and cDNA FASTA file contain the same transcripts:

"Watch out! @ensembl's Fasta and GTF annotation files available via https://t.co/2AhCSnL7py do not match (there are transcripts in the GTF not found in the Fasta file. Anyone else expected them to match?" K. Vitting-Seerup (@KVittingSeerup), August 13, 2018

However, my colleagues Joseph Min and Sina Booeshaghi found that for several species, Ensembl’s GTF file and cDNA FASTA file do not contain the same set of transcripts. Using the cDNA file is therefore not equivalent to extracting the transcript sequences from the genome with the GTF file when building a reference for pseudoaligning RNA-seq reads.
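A comparison of this kind can be sketched as a set difference over transcript IDs. The minimal Python sketch below assumes that GTF attributes carry `transcript_id "..."` and that each cDNA FASTA header starts with the transcript ID followed by a `.version` suffix. The toy records and the IDs `T1`, `T2`, and `T3` are hypothetical and not taken from any Ensembl release.

```python
import re

# Matches the transcript_id attribute in the 9th GTF column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def gtf_transcript_ids(lines):
    """Collect transcript IDs from the attribute column of GTF lines."""
    ids = set()
    for line in lines:
        if line.startswith("#"):  # skip header/comment lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:  # not a complete GTF record
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match:  # gene-level records have no transcript_id
            ids.add(match.group(1))
    return ids


def fasta_transcript_ids(lines):
    """Collect transcript IDs from FASTA headers, dropping the version suffix."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            name = line[1:].split()[0]    # first token of the header
            ids.add(name.split(".")[0])   # assumed form: ID.version
    return ids


# Hypothetical toy data standing in for the real files.
toy_gtf = [
    "#!genome-build toy",
    '1\ttoy\ttranscript\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; transcript_version "1";',
    '1\ttoy\texon\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; transcript_version "1";',
    '1\ttoy\ttranscript\t200\t300\t.\t+\t.\tgene_id "G2"; transcript_id "T2"; transcript_version "3";',
]
toy_fasta = [">T1.1 cdna toy", "ACGT", ">T3.2 cdna toy", "GGCC"]

gtf_ids = gtf_transcript_ids(toy_gtf)
fasta_ids = fasta_transcript_ids(toy_fasta)

only_in_gtf = gtf_ids - fasta_ids    # annotated but missing from the cDNA file
only_in_fasta = fasta_ids - gtf_ids  # in the cDNA file but not annotated

print("Only in GTF:", sorted(only_in_gtf))
print("Only in FASTA:", sorted(only_in_fasta))
```

For real files, the same two functions could be fed open file handles instead of lists, since both only iterate over lines.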

Continue reading

This post is about my new R package IslamicArt, which provides color palettes inspired by Islamic art. Disclaimer: While I accept the Islamic theology, ethnically speaking, I’m not from the Middle East, North Africa, Central Asia, South Asia, or Southeast Asia. However, I do deeply appreciate art and philosophy from what’s conventionally known as the Islamic world as well as Sufism. This R package is only about colors, not about theology.

Continue reading

R blogs I follow

This page is about R resources. I also have a list of resources about dialogues between science and religion. One of my favorite aspects of R is the vibrant R community. One way to learn from the community, whether new tools, efficient tricks, or points to note about data analysis, is to read blog posts. In fact, I learned parallel programming in R entirely from blog posts. These are the R blogs I follow:

Continue reading

This quarter, I’m TAing my adviser’s class on computational biology. Though I took this class a year ago and got an A, TAing has really deepened my understanding of the course material, much of which I have long been using routinely without thinking, such as principal component analysis (PCA). On the midterm, there was a problem asking students to give an example of 8 points in \(\mathbb{R}^2\) that do not have a unique 1-dimensional principal component projection.
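One configuration that seems to satisfy the problem, offered here as an illustrative sketch rather than as the intended midterm answer, is the 8 vertices of a regular octagon on the unit circle. Their covariance matrix is \(\tfrac{1}{2}I\), so every direction captures the same variance and no single first principal component is singled out. The Python sketch below checks this numerically. The function name `projected_variance` is my own, and the variance is the population variance (dividing by the number of points).

```python
import math

# Vertices of a regular octagon on the unit circle.
points = [(math.cos(k * math.pi / 4), math.sin(k * math.pi / 4)) for k in range(8)]


def projected_variance(pts, theta):
    """Population variance of the points projected onto the direction at angle theta."""
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    ux, uy = math.cos(theta), math.sin(theta)  # unit vector of the direction
    return sum(((x - mean_x) * ux + (y - mean_y) * uy) ** 2 for x, y in pts) / n


# Scan directions from 0 to 165 degrees in 15-degree steps.
variances = [projected_variance(points, math.radians(d)) for d in range(0, 180, 15)]

# If some direction captured strictly more variance, the spread would be non-zero.
spread = max(variances) - min(variances)
print("Spread of projected variance across directions:", spread)
```

Because the projected variance is the same in every direction, any line through the origin is an equally valid 1-dimensional principal component projection for these points.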

Continue reading

In September 2015, as I started working in a lab that requires bioinformatics skills, I made a new friend whose name is R. Before then, the last time I programmed was in 2008, in C, and I didn’t do well in it. Since then, R has become my de facto mother tongue in programming. Three years later, I’m writing a package for single-cell RNA-seq to be submitted to Bioconductor, and I have fixed bugs in other packages.

Continue reading

I have previously written about making the iconic Lorenz attractor animation with plotly; see that previous post for what the Lorenz system is. At the useR! conference this year, Thomas Lin Pedersen presented the brand new version of gganimate, which implements a grammar of animation, much like the grammar of graphics in ggplot2. In the older version by David Robinson, animations were made by adding an aes called frame. Now it’s just like adding geom_*s, scale_*s, stat_*s, etc.

Continue reading

This is the first post in this blog.

## [1] "Hello World!"

Once, for a class assignment, we were asked to control the Lorenz system. The instructor recommended that we use MATLAB for assignments, but since I’m inexperienced in MATLAB, I decided to use R instead, and used the package plotly to make interactive 3D plots of phase portraits of the Lorenz system.

Continue reading

Lambda Moses

Monotheist, Aspie, R lover, advocate for constructive dialogues between science and religion, studying computational biology at Caltech

graduate student

Los Angeles