R Hackathon 1

From NESCent Informatics wiki
NESCent Hackathon on Comparative Methods in R

Synopsis

NESCent sponsored a hackathon (Dec. 10-14, 2007) focused on integration of comparative methods within the R statistical package to promote interoperability, the support of data exchange standards, and greater usability of tools and methods in evolutionary bioinformatics.

A citable poster describing the hackathon and its outcomes is available at Nature Precedings. Also, the history of the hackathon and the various results that arose from it over time have recently been written up in a preprint submitted to a conference: Cranston, Karen, Todd Vision, Brian O’Meara, and Hilmar Lapp. 2013. “A Grassroots Approach to Software Sustainability.” doi:10.6084/m9.figshare.790739.

Motivation

Comparative phylogenetic methods provide a rich and powerful way to understand the evolution of organismal traits. A wide variety of statistical methods and tools have been developed to rigorously test hypotheses about rates and modes of trait evolution, trait covariation, correlation of traits with ecological and environmental factors, host/parasite co-evolution, etc. The R statistical analysis package has emerged as a popular platform for implementation of these methods.

The many individual software development efforts in R and the growing number of users presented an opportunity to address the common challenges of data exchange, interoperability, and usability. NESCent took advantage of this opportunity by sponsoring a hackathon, or codefest: an event at which programmers who do not otherwise have the opportunity to interact on a routine basis meet to collaboratively develop working code that furthers the goals of the larger open development community to which they belong. The hackathon brought together people and groups who had started to develop comparative phylogenetic methods in the R platform, or who wanted to integrate their methods into, or interface a tool with, the R platform.

Specific objectives

The following broad objectives were defined. Participants then helped to identify specific objectives and coding targets prior to and during the event.

  1. Ensuring compatibility and data flow between R packages, for example by agreeing on and implementing one or more common data models.
  2. Improving support for the input and output of data exchange standards and formats.
  3. Developing approaches to enable code re-usability and extensibility.
  4. Identifying new functionality that is well-suited to native integration into the R platform, and prototyping the integration of select targets.
  5. Identifying external software well-suited to interface with the R platform, and prototyping the interface for select targets. For instance, providing support for analysis of large data sets through an interface to existing C/C++ programs, or to broader capabilities through an interface to general-purpose phylogenetic analysis packages such as Mesquite.
  6. Developing end-user documentation.
  7. Identifying future research areas and initiating collaborations between different groups. See overview of packages, including future development.
  8. Training of future developers and broadening the diversity of the software development community.

We have a more detailed page of goals from user and programming perspectives.

The hackathon concentrated on writing code. All code and documentation are made available immediately and freely to the community under an open-source (OSI-approved) license.

Subgroups

From the list of objectives, the following seven subgroups were formed.

  1. Diversification SG
  2. Divergence Time Estimation SG
  3. Documentation SG
  4. Trait Evolution SG
  5. Class Design SG
  6. Mesquite-R communication SG
  7. Input-Output SG

Participants

Participation was arranged by invitation and by self-nomination followed by review.

  • The list of participants is on-line.
  • We would also like to thank the following people who contributed in the planning stages of the hackathon.

Organization

Organizing Committee

Schedule

  • The hackathon took place Dec. 10-14, 2007, at NESCent.
    • For more detailed information about the schedule click here.
    • For notes from the final Discussion click here

Organizational Activities

Detailed planning steps are outlined and documented separately. In particular, the following activities took place: