Introduction
In recent years, advances in molecular technologies have enabled researchers to characterize natural microbial communities in space and time without lab cultivation (Fuhrman, 2009). Mining and analyzing co-occurrence patterns in these new datasets is fundamental to revealing symbiotic relationships and microbe-environment interactions (Chaffron et al., 2010; Steele et al., 2011). Time series data, in particular, are receiving increasing attention, since both undirected and directed associations can be inferred from them.
Researchers typically use techniques such as principal component analysis (PCA), multidimensional scaling (MDS), discriminant function analysis (DFA) and canonical correlation analysis (CCA) to analyze microbial community data under various conditions. Unlike these methods, the Local Similarity Analysis (LSA) technique captures time-dependent (possibly time-shifted) associations between microbes and between microbes and environmental factors (Ruan et al., 2006). Significant LSA associations can be interpreted as a partially directed association network for further network-based analysis.
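To make the idea of a time-shifted local association concrete, here is a minimal Python sketch of a local similarity score between two equal-length series. It is an illustration only, not the eLSA package's API: the names `zscore`, `local_similarity` and `max_delay` are hypothetical, and a plain z-score is used for normalization as a simplifying assumption. For each allowed delay it scans the aligned products and keeps the best-scoring contiguous run, positive or negative.

```python
from statistics import mean, pstdev


def zscore(series):
    """Standardize a series (simplifying assumption for this sketch)."""
    m, s = mean(series), pstdev(series)
    if s == 0:
        raise ValueError("constant series cannot be standardized")
    return [(v - m) / s for v in series]


def local_similarity(x, y, max_delay=1):
    """Return (score, delay) for the strongest local association.

    delay = (index in x) - (index in y); a negative delay means the
    pattern appears in x before it appears in y.
    """
    n = len(x)
    if n != len(y):
        raise ValueError("series must have the same length")
    zx, zy = zscore(x), zscore(y)

    best_abs, best_sign, best_delay = 0.0, 1, 0
    for delay in range(-max_delay, max_delay + 1):
        pos = neg = 0.0  # running positive / negative association
        for i in range(n):
            j = i - delay
            if not 0 <= j < n:
                continue
            prod = zx[i] * zy[j]
            # Reset to zero when the run stops contributing.
            pos = max(0.0, pos + prod)
            neg = max(0.0, neg - prod)
            if pos > best_abs:
                best_abs, best_sign, best_delay = pos, 1, delay
            if neg > best_abs:
                best_abs, best_sign, best_delay = neg, -1, delay
    return best_sign * best_abs / n, best_delay


if __name__ == "__main__":
    a = [0, 1, 0, 0, 3, 0, 0, 1, 0, 0]
    b = [0, 0, 1, 0, 0, 3, 0, 0, 1, 0]  # same pattern, one step later
    print(local_similarity(a, b, max_delay=1))
```

In this sketch the sign of the score distinguishes positive from negative association, and the reported delay indicates which series leads. Assessing statistical significance is a separate step and is not shown here.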
Studies adopting the LSA technique have reported interesting and novel discoveries about microbial communities (Paver et al., 2010; Shade et al., 2010; Beman et al., 2011; Steele et al., 2011). However, the scale of current datasets has outgrown the old script. To improve computational efficiency, incorporate new features such as support for time series data with replicates, and make the analysis technique more accessible to users, we have re-implemented the LSA algorithm as a C++ extension to Python. We have also integrated the new LSA tool set with the popular Galaxy framework (Goecks et al., 2010) for web-based pipeline analysis.
Implementation
Figure 1.
The analysis workflow of Local Similarity Analysis (LSA) tools. Users start with raw data (matrices of time series) as input and specify their requirements as parameters. The LSA tools subsequently
Availability
- Download the released standalone source code package from https://bitbucket.org/charade/elsa/get/release.tar.gz and install it. See the README.txt file within the package (also viewable at https://bitbucket.org/charade/elsa) for detailed installation instructions and other information.
- Access the source code at https://bitbucket.org/charade/elsa. The Python package is open source so that advanced users can pipeline the analysis or implement other variants.
- *Not recommended*, as the VM currently contains only an older version of eLSA: download and use the SunLab VirtualBox machine with pre-installed eLSA from http://meta.usc.edu/softs/vbox/SunLab.vdi.tgz. Check the md5sum to ensure the integrity of the file: http://meta.usc.edu/softs/vbox/SunLab.vdi.tgz.md5.txt. See the README.txt file viewable at https://bitbucket.org/charade/elsa for detailed information. If you already have an Ubuntu VM, it is relatively easy to install eLSA from the release and source packages above.
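If the `md5sum` command is not available, the integrity check on a downloaded file can be sketched in Python as below. The function names are hypothetical, and the sketch assumes the published `.md5.txt` file follows the usual `md5sum` output layout, with the hexadecimal digest as the first whitespace-separated token.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, md5_text):
    """Compare a file's digest with the text of a published checksum file.

    Assumes the digest is the first token of md5_text.
    """
    expected = md5_text.split()[0].lower()
    return md5_of_file(path) == expected
```

For example, after downloading both files, `matches_published_md5("SunLab.vdi.tgz", open("SunLab.vdi.tgz.md5.txt").read())` should return `True` if the download is intact.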
Wiki
eLSA's Wiki pages have manuals, FAQs and other information that you MUST read before actually using the eLSA tool. They are openly editable. You are more than welcome to contribute to this ongoing documentation.
Notes
Contacts
Questions and comments should be addressed to lxia at usc dot edu.
Citations
Please cite references 1 and 2 if you use the eLSA Python package in your study. Please cite reference 3 if you used the old R script.
1. Li C. Xia, Dongmei Ai, Jacob Cram, Jed A. Fuhrman, Fengzhu Sun. Efficient Statistical Significance Approximation for Local Association Analysis of High-Throughput Time Series Data. Bioinformatics 2013, 29(2):230-237
2. Li C. Xia, Joshua A. Steele, Jacob A. Cram, Zoe G. Cardon, Sheri L. Simmons, Joseph J. Vallino, Jed A. Fuhrman, Fengzhu Sun. Extended local similarity analysis (eLSA) of microbial community and other time series data with replicates. BMC Systems Biology 2011, 5(Suppl 2):S15
3. Quansong Ruan, Debojyoti Dutta, Michael S. Schwalbach, Joshua A. Steele, Jed A. Fuhrman, Fengzhu Sun. Local similarity analysis reveals unique associations among marine bacterioplankton species and environmental factors. Bioinformatics 2006, 22(20):2532-2538