Canadian Institutes of Health Research

Institute of Genetics (IG)

Bioinformatics Workshop, Aylmer (Quebec) - Report of the Proceedings, September 19th, 2001

GENOME CANADA &
INSTITUTE OF GENETICS

CANADIAN INSTITUTES OF HEALTH RESEARCH

Bioinformatics Workshop

Aylmer (Quebec)

REPORT OF THE PROCEEDINGS
September 19th, 2001

BACKGROUND

Purpose
The Bioinformatics Workshop was jointly sponsored by Genome Canada and the CIHR Institute of Genetics. This strategic planning workshop was held at the Château Cartier in Aylmer, Quebec, on September 19th, 2001. Approximately 55 individuals from a variety of institutions and organizations in the bioinformatics community were assembled to participate in this workshop with the following two objectives:

  1. To develop a long-term strategy for moving Canadian research forward in this field, taking into consideration mutual opportunities (national and international), needs, and challenges for stakeholders.
  2. To devise a strategy to ensure there are proposals in this field ready to submit to Genome Canada for the December 2001 competition and to the CIHR Institute of Genetics for the 2002 competition.

Steering Committee
The sponsors identified a steering committee to guide the planning process for the workshop over the summer months of July and August 2001. During that time, the steering committee held four teleconferences and almost daily electronic communications focused on the following issues:

  1. Where will the next breakthrough be required in this field to move forward in dramatic fashion?
  2. If we could ignore current limitations (time, money, personnel, knowledge and skills) what questions/ issues would we like to work on?
  3. What are the unique skills, research issues, challenges and opportunities that we
    3.1. have in Canada, or
    3.2. do not, and should have in Canada?
  4. How could the bioinformatics community in Canada collectively solve the current critical need for bioinformatics support to the primary data-production aspects of the genomic studies funded by Genome Canada and others?
  5. Identification of workshop participants

Discussion within the Steering Committee over the two months confirmed commitment to the following principles:

  1. That a dependable, comprehensive and integrated resource must be created to support the biology-led genomic projects funded by Genome Canada or other funders. Coordination is required to optimize efficiency and throughput and to minimize duplication of effort and investment. This support has two critical aspects:
    • Service which will involve varying levels of bioinformatics support from simple provision of software through to the development of highly novel IT, consortia to develop generic data integration tools, standardization and data banking protocols.
    • Consulting support on a range of topics and issues that will emerge as the above work proceeds within the biology-led projects.
  2. A significant training component must also be addressed in a coordinated fashion in order to produce the bioinformatics-knowledgeable personnel at a range of levels required by the biology-led projects. These personnel will range from re-trained mature scientists to undergraduate students from across the sciences.
  3. In order to attract the best international scientists to lead and/ or take part in the above support commitments, it will be necessary to champion and contribute significantly to their bioinformatics research endeavors and aspirations.

The Steering Committee identified the following possibilities, but realized that these are not mutually exclusive and that aspects could be combined in different useful combinations. They also realized that there are many other potentially workable ideas not yet identified. All agreed however that clear direction is needed from Genome Canada about preferred directions and available support to achieve the goals involved.

  1. Continue the science-driven competitive model established in the first Genome Canada funding cycle. This alternative would focus funding consideration on individual investigator-led projects: for example BIND/BLUEPRINT, CyberCell, GoBase, MAGPIE/ BLUEJAY.
  2. Directly fund the enabling bioinformatics functions of service, consulting and training. Build these in a cooperative national fashion on existing programs such as the Canadian Bioinformatics Resource (CBR) and the Canadian Bioinformatics Workshops (CBW).
  3. Facilitate the development of a coordinated structure to support service, consulting, training and bioinformatics research. For example: site the coordination responsibility for bioinformatics in one regional genomic centre and fund that centre to coordinate service, training, consulting with the requirement that these supports be provided and available on site in all the other regional centres.
  4. Facilitate the identification of a bioinformatics research niche for Canada. At least four initial themes were identified as worthy of further discussion:
    • biological data integration at multiple scales of time and space
    • integration of functional genomic data
    • exploiting the existence of sequence information coupled with easier access in Canada to clinical data
    • imaging, and simulation of biological systems

Context
Bioinformatics can be defined as the development and use of computational and mathematical methods for the acquisition, archiving, analysis and interpretation of biological information to determine biological functions and mechanisms as well as their applications in user communities.

An important development for the future of bioinformatics is GRID capability. The GRID is an advanced computational concept intended to provide the next generation of the World Wide Web (WWW). The major differences between the GRID and the WWW are the increased computing power available, the increased volume of data that can be handled and the speed with which data can be transferred between nodes on the GRID. The GRID will also provide vast capacity to store and retrieve data from a variety of sources and will allow data to be presented in the same format, regardless of source. GRID capability is being initiated in Canada by the NRC.

Biological science is now one of the major areas of focus of scientific research worldwide. This effort is increasingly being channeled through large interdisciplinary teams working on specific problems of significant biological interest with direct industry application (e.g. in pharmaceuticals, healthcare, biotechnology and agriculture). The increasing amounts of data being generated by these groups are often complex: the human genome is complicated, all of the expression experiments are difficult and the 3D structures are intricate. Additionally, these data are generated through different media, variable in quality, stored in many places, difficult to analyze, often changing and mostly composed of incomplete data sets. Support for database maintenance will be crucial. Learning to harness and exploit these data is a major challenge for bioinformatics and will also impact directly on the success of Genome Canada.

CANADIAN ADVANTAGES IN BIOINFORMATICS

Based on conversations with Steering Committee members and a review of available literature, the following appear to be Canadian advantages to be exploited by Canadian bioinformatics in defining a Canadian niche:

  1. internationally renowned centres of experimental biology, the source of new data for bioinformatics research;
  2. expertise across a broad representation of the next generation research issues in bioinformatics (e.g. knowledge representation, knowledge driven inference, database federation and linkage studies). However, these individual experts are widely distributed, not well known to each other and attempting to operate individually to satisfy the full range of academic, research and service demands. (If you will permit me a poor analogy: we seem to have a strong No Trump hand in Canadian bioinformatics. The question is whether we can organize the bidding process to best set up and then play the cards well enough to secure the contract!);
  3. access to the best broadband digital network in the world. CA*net is considered to be 5 to 10 years ahead of networks in other countries in terms of connectivity, yet is underutilized by Canadian researchers. CANARIE is preparing to move to CA*net 4 with funding expected in the 2002 federal budget. They are also committed to building the GRID concept from the applications (science) side, which is an opportunity for Genome Canada and Canadian scientists;
  4. the ad hoc NRC GRID committee in June 2001 identified bioinformatics as the most promising lead content for their continued development of GRID technology;
  5. access to individual clinical records of significant depth and duration in all provinces and territories each with unique identifiers. This clinical data is supported by a publicly funded health care system with strong public health components, which would support follow-up potential and longitudinal research designs;
  6. considerable experience with clinical record linkage studies, along with federal, provincial and local interest in and support for such studies (e.g. CIHI, CHII, the Office of Health Information Highway);
  7. an international reputation for stewardship and good government policy across a range of sensitive issues (e.g. privacy, data protection, community recruitment).

WORKSHOP PROCESS

One week prior to the workshop, participants were provided with a document, Toward an Integrated Vision for Bioinformatics in Canada, prepared by Lynn Curry of CurryCorp, which reviewed the above background and context.

After welcoming words from the two sponsors (M. Godbout for Genome Canada and R. McInnes from the CIHR Institute of Genetics), overviews were presented in the following areas:

  1. Steering Committee Perspectives: Peter Lewis
    Bioinformatics was defined and an overview of the process leading to the discussion document, as well as the objectives of the workshop were described.
  2. Bioinformatics in the Context of a Genome Sequence Centre: Marco Marra
    The UBC genome centre, its organizational structure, mandate and activities were outlined.
  3. Canadian Bioinformatics Workshops (CBW): Francis Ouellette
    The origins and activities of the CBW were reviewed. Possible future directions for the CBW were outlined.
  4. Canadian Bioinformatics Resource (CBR): Simon Mercer
    The mandate, activities and services of the CBR were described. Plans for future expansion of the capabilities of the CBR as well as outreach were detailed.
  5. Blueprint: Chris Hogue
    A protein-protein interaction database was described in detail. The database will be a source of information to be mined by bioinformatic methods.
  6. Cybercell: Michael Ellison
    A project which will lead to the complete simulation of an E. coli cell was described.
  7. Genome Databases and Analysis Tools: Gertraud Burger
    Methods for the analysis of genome databases were outlined. The need for cooperation between computer scientists and biologists was emphasized.
  8. MAGPIE: Christoph Sensen
    MAGPIE, a fully automated genome analysis and annotation engine, was described. It is used to analyze and annotate more than 50 publicly available genomes.

Workshop participants were then presented with three general challenges:

  1. To articulate issues and suggest strategy in bioinformatics service, support and training;
  2. To suggest a range of 'big visions' for bioinformatics as a science in Canada; and
  3. Considering the above, to suggest priorities for attention.

Participants worked through these considerations in a combination of small groups, to enhance in-depth conversation, and plenary sessions to share and compare ideas.

WORKSHOP RESULTS

The following summarizes the combined results of all the breakout groups and plenary discussions.

Service and Support

  • Participants believed that the Canadian bioinformatics community has no obligation to provide service but does have an obligation to support biology-led research.
  • Projects funded by Genome Canada, CIHR or any other funder should be provided with sufficient funds to purchase bioinformatics services and consulting support. The majority of monies to support service should go to new developments in bioinformatics. Funding agencies should also provide sufficient leadership to minimize service redundancies across funded projects.
  • Some 'application service provider' core service/support should be provided from a centralized source (regional, CBR, others) to support those with small labs and/ or insufficient funds to purchase bioinformatics service/consulting support on a fee-for-service basis. Service/support demands will change as new research tools become better understood and widely available. In this sense bioinformatics service/support is a commodity with a limited 'shelf-life' that must be constantly renewed to maintain utility and relevance. Regular investments must be made to keep any centralized service/ support centre relevant to the science it supports.
  • One mechanism to offer central access for support service would be to extend the CBR nation-wide. It is inexpensive and already has most of what it needs to be effective. There are also some commercial solutions (e.g. the Australian ANGIS model) that should be made freely available to everyone. To be efficient, any centralized access must coordinate people, hardware and services on a national basis. To be effective, any centralized service must be relevant to the scientific interpersonal requirements at each site.
  • Alternatively, bioinformatics service/ support would be a reasonable expectation in research collaboration, in which case it is not useful to separate service from research. Computing science and mathematics must be better integrated into biological research. In order to attract the best bioinformaticians as research collaborators, we must recognize market forces and deal with the salary disparity between academic and private sectors. Partnership opportunities should be sought with vendors such as IBM, Silicon Graphics Inc. and Sun; service providers such as Biotools, Gene Ontology and MDS; smaller companies; manual publishers; the European Bioinformatics Institute; and the National Center for Biotechnology Information.
  • To implement the service/ support on a fee-for-service basis, a program could be initiated with institutional support for individuals to be seconded to Genome Canada projects as consultants. Genome Canada funds could also be made available to employ bioinformaticians in specialists' labs. National oversight and coordination must be put in place to create and monitor mechanisms to improve efficiency/ effectiveness of bioinformatics service/ support provided through these or any other mechanisms.
  • Research collaborations could be fostered through travel support to visit other centres and funds to support continued interaction. Workshops should be held routinely to encourage such exchanges. A 'virtual bioinformatics institute' might also support growth of productive research collaboration (Ag-Food Canada has an experience-sharing model that may be useful). Summer students' training should be coordinated, and co-op programs as well as bioinformatics 'boot camps' should be organized, to allow scientists at all levels of training and experience to learn more about bioinformatics and to make connections between biologists and bioinformaticians.

Training/Education

  • The Canadian bioinformatics community does have an obligation to educate/train. To meet perceived needs, about 200 people and 50 trainers need to be trained annually.
  • Bioinformatics is a new discipline that should be taught through existing mechanisms of university education. There should be explicit initiatives at institutions such as MSc/PhD interdisciplinary programs, career transition, education/ training for both tool users and tool builders and agents for change. Universities should work together to develop appropriate curricula instead of each one developing independently.
  • The CBW traveling road show is effective and should be supported, but is limited in scope. The CBW should have permanent sites in both eastern and western Canada. In order to increase its efficiency the CBW must 'train trainers'. The CBW training should seek full accreditation status to improve the credibility and currency value of its training programs and results.
  • There should be short-term specific training support within funded projects in order to create more technical staff. In addition, there should be some long-term academic training support possibilities that would be CIHR/ NSERC driven. This will need substantial investment in the basic research disciplines of computational biology, molecular biology, and genomics. A training program with scholarships and fellowships should be considered that would be funded by CIHR, NSERC, Genome Canada and others. A special independent committee using peer review criteria should make selection decisions for such a program.
  • The best aspects of existing training/education models (CBR, CBW, CSA) should be examined and retained. The NIH training grant model may also be useful to consider. In this model students rotate through different areas of the lab to gain wide exposure and contacts. The CIAR model for investigator support should also be reviewed.
  • NSERC and the private sector need to be included in the further development of training/ education plans for bioinformatics. Partnerships for education/ training should also involve universities, Blueprint, CIHR, IBM and other industrial partners.

Vision for Canadian Bioinformatics Science

An effective 'vision' must build on Canadian strengths and minimize weaknesses. Canadian strengths and opportunities relevant to the science of bioinformatics were perceived by the workshop participants to include the following points.

  • Bioinformatics is a new discipline, and new disciplines can define their own opportunities. CA*net 3 (and the soon to be implemented CA*net 4) offers the capability to perform distributed processing and storage (GRID). We have the opportunity to correlate genome variation with function and phenotype because we already have lots of data in agriculture and in human health (health informatics, human clinical records, isolated populations, socialized health care) supported by the existence of clinical and diagnostic data in central repositories. Data integration will become more feasible through creating common standards and formats across existing successful bioinformatic projects in Canada (e.g. BIND; CyberCell; Gene Ontologies and Wilkinson's for plants). These integrative efforts also allow development of productive private/ public interactions. We have expertise in creating, maintaining and integrating large databases (e.g. seismic databases, oil and gas sub-surface information). Funding from the Canada Foundation for Innovation (CFI) has made the necessary infrastructure widely available across Canada.
  • Areas of specific scientific strength in Canada, which should be actively leveraged into world-class results, were acknowledged as:

    algorithm design, artificial intelligence, biodiversity, biology, computer science, computational biology, data/ database integration, genetics, gene expression and interactions, high performance computing, high bandwidth networking, medical imaging, molecular evolution, protein engineering, proteomics, RNA folding, simulation and modeling, structural analysis and visualization.

Canadian Weaknesses and Threats

  • The bioinformatics community is sparse (insufficient critical mass). The brain drain phenomenon and retention problems threaten the already small community of Canadian bioinformaticians with a lack of manpower. The bioinformatics community is fragmented and adequate ongoing communication is not supported, resulting in insufficient networking to create and sustain world-class centres. Bioinformatics is currently perceived as a support service, indicating a lack of recognition that bioinformatics is research. The lack of definition is a threat, as this view makes bioinformatics interdisciplinary when, in actuality, it is a scientific discipline on its own.
  • The bioinformatics industry is not strong and is insufficiently supported. Not being prepared for the influx of data and not making data interpretable across formats/ platforms/ languages risks losing or wasting data. Proprietary research restricts the flow of information vital to grow the scientific basis of any discipline, including bioinformatics.
  • Bioinformatics lacks profile and understanding among funding bodies and researchers. There is a lack of bioinformatics research in institutions and a lack of coordination in what there is. This is partly due to the lack of large-scale integrative projects (i.e. no TIGR), the lack of peer-review in granting agency panels and the lack of viable funding mechanisms for basic bioinformatics research. Genome Canada's matching funding requirements constrain responsiveness to opportunities offered by that funding source and thereby restrict the growth of bioinformatics as a collaborating science in genomic biology. Matching fund requirements reward the best deal makers not the best scientists, have steering effects on science direction and generally diminish scientific morale.

The Next Breakthrough in Bioinformatics

The next "big thing" in bioinformatics will include the following: database exchange and distributed annotation systems like ENSEMBL. How can we simulate life by the year 2020? Where do the parts come from, how do they fit together and how do they work? Omeganomics. Modeling feedback in-vitro and in-silico and the reverse. Combining genomics, chemistry, bioinformatics, math and physics into systems biology.

Possible Canadian Niches

Plant bioinformatics; organism-specific bioinformatics; large database methods; visualization; human health care bioinformatics; comparative genomics; expression databases.

Suggested Priorities and Mechanisms

Funding agencies should coordinate activity and there should be bioinformaticians on peer review panels. Consider establishing a new panel at CIHR on genomics and bioinformatics that could fund both research and training. Adequately fund worthy programs. The matching money requirements should be dropped. Eliminate micro-management; there are too many conflicting criteria for funding, quarterly reporting and project management requirements. Build elements of trust at all levels. Better use of networking and communication to overcome barriers of geography and turf protection. Bioinformatics should be recognized as an emerging discipline. There is a need for an annual international caliber bioinformatics meeting to share ideas. Support regular 'integrative' or 'research agenda' workshops on the CIAR model.

NEXT STEPS

The discussion following the reports from the breakout groups revealed that there was enthusiasm for recognizing Bioinformatics as an emerging discipline, which needs support for its development. Several attendees suggested that the time might be right for the formation of a Canadian Bioinformatics Society with an annual scientific meeting, and the IG indicated that it would be willing to support such a meeting. A discussion about the support of large-scale projects revealed that this was not a high priority, but should definitely be available for deserving projects. One of the most useful concepts arising from the meeting was the need to promote and facilitate projects and training which lead to transdisciplinary initiatives between investigators in the "biology" and "informatics" communities.


Created: 2003-05-09
Modified: 2003-05-09