Novartis is pioneering new informatics tools for drug discovery. We believe in the power of open-source, global collaboration for the greater good. Join us to help patients worldwide.

Peax

Peax is a tool for interactive concept learning and exploration of epigenomic patterns based on unsupervised machine learning with autoencoders.

GitHub Project | Download Peax


Jenkins-LSCI

Jenkins-LSCI enables research scientists to build workflows and data pipelines on the same robust framework and plugin ecosystem as Jenkins-CI, the widely used continuous integration server that supports building, deploying, and automating any software project.

GitHub Project


Habitat

Habitat is a simple yet powerful self-contained object storage management system. Based on Amazon Web Services, it is capable of virtually unlimited storage. Instead of a large centralized management system, Habitat can be used as a local repository for a single application, or it can be shared and used with many clients.

Habitat is best suited to situations where the client producers and consumers of the files do not require a file system protocol interface and can use HTTP(S) to access the store.

GitHub Project | Download Habitat


YADA

Access any data, at any source, in any format, from any environment, using just a URL, with just one-time configuration.

Get data from multiple sources, in different formats, and merge the results into one with uniform column names, on the fly, using one URL.
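As a rough illustration of what merging results "with uniform column names" involves, here is a minimal sketch in plain Python. It does not use YADA or its API. The payloads, column names, and helper functions (`harmonize`, `merge_sources`) are hypothetical and exist only for this example.

```python
import csv
import io
import json

# Hypothetical payloads from two sources with different formats and column names.
CSV_PAYLOAD = "gene_id,expr\nBRCA1,7.2\nTP53,5.1\n"
JSON_PAYLOAD = '[{"gene": "EGFR", "expression": 9.4}]'

# Map each source's column names onto one shared vocabulary.
CSV_COLUMNS = {"gene_id": "gene", "expr": "expression"}
JSON_COLUMNS = {"gene": "gene", "expression": "expression"}


def harmonize(rows, column_map):
    """Rename the keys of each row according to column_map."""
    return [{column_map[key]: value for key, value in row.items()} for row in rows]


def merge_sources():
    """Parse both payloads and return one list of rows with uniform columns."""
    csv_rows = list(csv.DictReader(io.StringIO(CSV_PAYLOAD)))
    json_rows = json.loads(JSON_PAYLOAD)
    merged = harmonize(csv_rows, CSV_COLUMNS) + harmonize(json_rows, JSON_COLUMNS)
    # CSV values arrive as strings; convert so every row uses the same types.
    for row in merged:
        row["expression"] = float(row["expression"])
    return merged


merged = merge_sources()
```

In a service like the one described, this mapping would presumably be configured once on the server side, so that client code only ever sees the unified column names.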

Its raisons d'être are to enable efficient, non-redundant development of data-dependent applications and utilities, data source querying, data analysis, processing pipelines, extract, transform, and load (ETL) processes, etc. YADA does all this while preserving total decoupling between data access and other aspects of application architecture, such as the user interface.

GitHub Project | Download YADA


OntoBrowser

The OntoBrowser tool was developed to manage ontologies and code lists. The primary goal of the tool is to provide an online collaborative solution for expert curators to map code list terms (sourced from multiple systems/databases) to preferred ontology terms. Other key features include visualisation of ontologies in hierarchical/graph format, advanced search capabilities, peer review/approval workflow, and web service access to data.

GitHub Project | Download OntoBrowser


Railroadtracks

Railroadtracks is a Python toolkit to handle graphs of dependent tasks, such as the ones found in bioinformatics pipelines.

It was created for comparing RNA-Seq pipelines and has since found use in other situations, such as writing a flexible system for the QC of NGS data.
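To make the idea of a "graph of dependent tasks" concrete, here is a minimal sketch using only the Python standard library (`graphlib`, Python 3.9+). It does not use the Railroadtracks API. The task names and the `execution_order` helper are hypothetical and stand in for typical RNA-Seq steps.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "build_index": set(),
    "align_reads": {"build_index"},
    "count_reads": {"align_reads"},
    "qc_report": {"align_reads"},
    "differential_expression": {"count_reads"},
}


def execution_order(graph):
    """Return one valid order in which every task runs after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dependencies)
```

Any order returned places `build_index` before `align_reads`, and `align_reads` before both `count_reads` and `qc_report`. Tasks with no dependency between them, such as `count_reads` and `qc_report`, may appear in either order.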

GitHub Project | Download Railroadtracks | Documentation (PDF, 0.5 MB)


Yet Another Pipeline

YAP is an extensible parallel framework, written in Python using OpenMPI libraries. It allows researchers to quickly build high-throughput big data pipelines without extensive knowledge of parallel programming. The user interacts with the framework through simple configuration files to capture analysis parameters and user-directed metadata, enabling reproducibility. Analysts have been able to achieve a significant speedup of up to 36× in RNA-Seq workflow execution time.

YAP has been designed to be scalable and flexible. We have implemented YAP with a focus on next-generation sequencing (NGS), to meet the large data processing challenges at NIBR. However, the framework can be easily adapted for any kind of analysis. It can be executed on your local Linux workstations or on large HPC cluster systems. The framework achieves efficiency by implementing optimal data handling mechanisms, such as parallelization and avoiding file I/O by using data streams and named pipes.
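The named-pipe technique mentioned above can be sketched in a few lines of standard-library Python. This is an illustrative example for Linux or other POSIX systems, not code from YAP. The function names and the toy "uppercase" processing step are assumptions made for the sketch.

```python
import os
import tempfile
import threading


def produce(fifo_path, records):
    """Upstream stage: stream records into the named pipe."""
    # Opening a FIFO for writing blocks until a reader opens the other end.
    with open(fifo_path, "w") as fifo:
        for record in records:
            fifo.write(record + "\n")


def consume(fifo_path):
    """Downstream stage: read records as they arrive and process them."""
    with open(fifo_path) as fifo:
        return [line.strip().upper() for line in fifo]


def run_streaming_stage(records):
    """Connect two stages through a named pipe instead of an intermediate file."""
    with tempfile.TemporaryDirectory() as tmp:
        fifo_path = os.path.join(tmp, "stage.fifo")
        os.mkfifo(fifo_path)  # POSIX only
        writer = threading.Thread(target=produce, args=(fifo_path, records))
        writer.start()
        result = consume(fifo_path)
        writer.join()
    return result


processed = run_streaming_stage(["acgt", "ttga"])
```

The pipe appears in the file system as a path, so tools that expect a file name can read from it, but the data passes through a kernel buffer rather than being written to disk. In a real pipeline the two stages would typically be separate processes rather than threads.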

GitHub Project | Download Yet Another Pipeline


RDKit

The RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python. The core algorithms and data structures are written in C++. Wrappers are provided to use the toolkit from Python, Java, or C#. Additionally, the RDKit distribution includes a PostgreSQL-based cartridge that allows molecules to be stored in a relational database and retrieved via substructure and similarity searches.

Please see the RDKit Documentation for more information on installation, usage, cookbooks, and lots more.

GitHub Project | Download RDKit


GridVar

GridVar is a jQuery plugin that visualizes multi-dimensional datasets as layers organized in a row-column format. At each cell (i.e., the rectangle at the intersection of a row and column), GridVar displays your data as a background color (like a heat map) and/or a glyph (shape). This enables different characteristics of your dataset to be layered on top of each other. For more information on usage, required libraries, and other developer information, please see our documentation on GitHub.

GitHub Project | Download GridVar