A Framework for Selecting Large-scale, Distributed Data-intensive Software Connectors (Poster)
Student: Chris A. Mattmann, JPL/USC
Advisor: Nenad Medvidovic, USC/ISR
Collaborator: Daniel J. Crichton, JPL/USC
Abstract: Understanding the tradeoffs among large-scale, distributed, data-intensive software connectors is a challenging task shaped by formidable software architectural issues. Several types of data-intensive software connectors exist, including peer-to-peer networks, event-based systems, grid technologies, and traditional client-server protocols. Each connector type claims to support large-scale data distribution and to be reliable, efficient, and scalable, but in reality some connectors are extremely efficient yet not very dependable, while others are highly consistent and dependable but neglect efficiency and scalability. Given such disparity, how can one select the appropriate connector for a given data distribution scenario and be assured that the connector will be compatible with the existing system architecture into which it will plug? Additionally, is there any way that the connectors can be used together, carefully selecting desired properties from each, in order to satisfy a particular distribution scenario?
To address this issue, we present a framework for selecting data-intensive software connectors that allows a user to understand the key properties of data distribution as they relate to issues at the architectural level. Our framework comprises four steps: classification, categorization, integration, and testing. The classification step generates profiles of connectors along eight key dimensions of data distribution, gleaned from a thorough literature review and from our experience constructing large-scale distributed data systems at the Jet Propulsion Laboratory. The categorization step identifies candidate connectors that could be used together for a particular distribution scenario; it additionally detects architectural mismatches between the connectors and allows for the inclusion of user preferences such as reliability (number of faults/second), scalability (data volume and number of hosts), and efficiency (memory footprint and transfer throughput). The integration step combines the connectors using one (or more) of four types of connector integration. Finally, the testing step quantitatively evaluates four key performance properties of the data-intensive software connectors (scalability, reliability, efficiency, and consistency) and tests the connectors' satisfaction of a particular distribution scenario.
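To make the categorization step concrete, the sketch below ranks connector profiles against weighted user preferences. This is a minimal illustrative example, not the framework's actual implementation: the class names, the three preference dimensions shown, and all numeric values are assumptions introduced for illustration only.

```python
# Hypothetical sketch of preference-weighted connector categorization.
# All profile values and weights below are invented for illustration.
from dataclasses import dataclass

@dataclass
class ConnectorProfile:
    name: str
    reliability: float   # e.g., normalized from observed faults/second
    scalability: float   # e.g., normalized from data volume and host count
    efficiency: float    # e.g., normalized from throughput and memory footprint

def rank_connectors(profiles, weights):
    """Rank candidate connectors by a weighted sum of preference scores."""
    def score(p):
        return (weights["reliability"] * p.reliability
                + weights["scalability"] * p.scalability
                + weights["efficiency"] * p.efficiency)
    return sorted(profiles, key=score, reverse=True)

# Illustrative (made-up) profiles for the connector classes named above.
profiles = [
    ConnectorProfile("peer-to-peer", reliability=0.6, scalability=0.9, efficiency=0.7),
    ConnectorProfile("grid", reliability=0.8, scalability=0.8, efficiency=0.5),
    ConnectorProfile("event-based", reliability=0.7, scalability=0.6, efficiency=0.8),
    ConnectorProfile("client-server", reliability=0.9, scalability=0.4, efficiency=0.65),
]

# A distribution scenario that prizes reliability over raw efficiency.
weights = {"reliability": 0.5, "scalability": 0.3, "efficiency": 0.2}
ranked = rank_connectors(profiles, weights)
print([p.name for p in ranked])
```

A real categorization step would also have to reject combinations with architectural mismatches; a simple scalar ranking like this only captures the preference-weighting aspect.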
Bio: Chris Mattmann is a Ph.D. student in the Computer Science Department at the University of Southern California, where he is a member of the Software Architecture Research group advised by Dr. Nenad Medvidovic. Chris recently passed his qualifying exam (January 2006). His dissertation research will investigate software connectors and their properties in highly distributed and voluminous data-intensive systems. This research area grew out of the need at NASA and other scientific research institutions and universities to understand the tradeoffs among available off-the-shelf classes of data movement technologies. In parallel with his research at USC, Chris works full time at the Jet Propulsion Laboratory as a Software Engineer in the Modeling and Data Management Systems section, where he is managed by Daniel J. Crichton.