Ontario Institute for Cancer Research (OICR) is one of the nominees for the Superuser Awards to be presented at the OpenStack Summit Vancouver, May 21-24.


It’s time for the community to help determine the winner of the OpenStack Vancouver Summit Superuser Awards, sponsored by Zenko. Based on the community voting, the Superuser Editorial Advisory Board will review the nominees and determine the finalists and overall winner.

Now, it’s your turn.

OICR is one of seven nominees for the Superuser Awards. Review the nomination criteria below, check out the other nominees and rate the nominees before the deadline Tuesday, April 24 at 11:59 p.m. Pacific Time Zone.

Cast your vote here!

Who are the team members?
George Mihaiescu (cloud architect)
Jared Baker (cloud engineer)
Francois Gerthoffert (project manager)
Rahul Verma (software developer)
Robert Tisma (software developer)
Dusan Andric (software architect)
Vincent Ferretti (principal investigator)
Lincoln Stein (principal investigator)
Junjun Zhang (senior bioinformatics manager)

Site: www.cancercollaboratory.org

How has open infrastructure transformed your organization? 

OpenStack made it possible for OICR to build the Cancer Genome Collaboratory, a cloud that enables research on the world’s largest and most comprehensive cancer genome dataset.

Researchers can run complex analyses across a large repository of cancer genome sequences. Instead of spending weeks to months downloading hundreds of terabytes of data from a central repository before computations can begin, researchers can upload their analytic software into the Collaboratory cloud, run it and download the computed results in a secure fashion.

The Collaboratory is home to the data holdings of the International Cancer Genome Consortium (ICGC), a global collaboration involving more than 40 countries/jurisdictions to sequence the genomes across 50 major cancer types. Users of the Collaboratory have fast and easy access to this unique data set.

How has the organization participated in or contributed to an open infrastructure community? 

OICR is hosting the Toronto OpenStack meetups in 2018. George Mihaiescu and Jared Baker have presented at past OpenStack Summits (Barcelona, Boston and soon Vancouver), and they report bugs and provide feedback on the mailing list and IRC channels.

Mihaiescu has been involved with OpenStack since the Cactus release, and the first conference he attended was in Boston in 2011. He is active in the OpenStack scientific working group and is a member of the OpenStack Day Canada 2018 organizing committee.

The team also writes blog posts on http://softeng.oicr.on.ca/, sharing knowledge of open-source technologies.

What open source technologies does the organization use in its IT environment?

All the technologies used in Collaboratory are open source, including: Linux, OpenStack, Ceph, Ansible, Zabbix, Elasticsearch, Logstash, Kibana, Grafana, MaaS, ARA.

We are proud to always choose open source alternatives for our technology. We benefit from the flexibility and freedom they provide and use the cost savings to invest in increased capacity for cancer research. Most importantly, we couldn’t offer cancer researchers a cloud environment at this scale and price point if we weren’t using open-source technologies like OpenStack and Ceph.

What is the scale of the OpenStack deployment? 

The Collaboratory has 2,600 cores and 18 terabytes of RAM, as well as 7.3 petabytes of storage managed by Ceph. More than 40 research labs and 95 researchers across four continents are using Collaboratory to access the 670 terabytes of protected cancer genome data stored.

The Collaboratory has contributed to the research underlying 43 peer-reviewed papers, with an additional 50 papers from the Pan-cancer project currently in preparation or review.

Cancer genomics research workloads have special needs, so the Collaboratory provides very large flavors to accommodate their storage, CPU and memory requirements. Because it uses only open-source software and commodity hardware, the cost of using the Collaboratory is almost 40 percent less than that of the leading commercial cloud provider.

What operational challenges have you overcome during your experience with open infrastructure? 

We initially deployed OpenStack on the Juno release and upgraded live multiple times all the way to the Ocata release, with a planned upgrade to Pike in the weeks to come.

We also live upgraded the operating system from Ubuntu 14 to 16 and Ceph from Hammer to Jewel. From packaging issues to documentation mistakes, we gained a lot of operational experience over time managing OpenStack and Ceph.

With an infrastructure support team of just two people, we have to stay close to the latest OpenStack version and be careful about which projects we support, while keeping the environment stable and secure.

Both infrastructure support people are OpenStack certified (COA) and spend considerable time researching and testing new OpenStack releases and features.

How is this team innovating with open infrastructure? 

The Collaboratory team has developed two open-source applications for our users:

– A cost-recovery application providing the principal investigators with daily usage and cost metrics.

– An enrollment application, allowing researchers to request OpenStack tenants to be created, or users to be added to existing tenants.

Also, we developed and open sourced metadata and storage software used to provide granular and time-limited access to the protected data sets only to approved researchers.

Another open-source project developed at OICR is Dockstore, whose goal is to allow researchers to package and share bioinformatics tools and workflows.
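
To give a rough sense of what the daily usage and cost metrics mentioned above could involve, here is a minimal Python sketch. It is an illustration only, not the Collaboratory's cost-recovery application: the article does not describe how that application works, and the record fields, the function name and the hourly rates below are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical hourly rates; the article does not state actual prices.
RATE_PER_VCPU_HOUR = 0.02
RATE_PER_GB_RAM_HOUR = 0.005


@dataclass(frozen=True)
class UsageRecord:
    """One instance's usage on one day (hypothetical record layout)."""
    project: str   # OpenStack project (tenant) name
    day: str       # ISO date, e.g. "2018-04-01"
    vcpus: int     # virtual CPUs of the instance flavor
    ram_gb: int    # RAM of the instance flavor, in gigabytes
    hours: float   # hours the instance ran on that day


def daily_costs(records):
    """Sum the estimated cost per (project, day) over all usage records."""
    totals = defaultdict(float)
    for r in records:
        hourly = r.vcpus * RATE_PER_VCPU_HOUR + r.ram_gb * RATE_PER_GB_RAM_HOUR
        totals[(r.project, r.day)] += hourly * r.hours
    return dict(totals)


if __name__ == "__main__":
    sample = [
        UsageRecord("lab-a", "2018-04-01", vcpus=8, ram_gb=32, hours=24.0),
        UsageRecord("lab-a", "2018-04-01", vcpus=4, ram_gb=16, hours=12.0),
    ]
    for (project, day), cost in sorted(daily_costs(sample).items()):
        print(f"{project} {day} {cost:.2f}")
```

In a real deployment the usage records would come from the cloud's own accounting data rather than a hard-coded list; that source is left out here because the article does not specify it.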

How many Certified OpenStack Administrators (COAs) are on your team?

Two.

Voting is limited to one ballot per person and closes Tuesday, April 24 at 11:59 p.m. Pacific Time Zone.

Superuser