Computational & Data Science, Infrastructure, & Interdisciplinary Research on University Campuses:
Experiences and Lessons from the Center for Computation & Technology
[This paper is the work of Daniel S. Katz (CCT Director of Cyberinfrastructure Development, 2006 to 2009) and Gabrielle Allen (CCT Assistant Director, 2003 to 2008); it does not reflect the views or opinions of the CCT or LSU.]
In recent years, numerous distinguished national panels (1) have critically examined modern developments in research and education and reached a similar conclusion: computational and data-enabled science, as the third pillar of research, standing equally alongside theory and experiment, will radically transform all areas of education, scholarly inquiry, and industrial practice, as well as local and world economies. The panels also concluded that to facilitate this transformation, profound changes must be made throughout government, academia, and industry. The remarks made in the 2005 Presidential Information Technology Advisory Committee (PITAC) report (2) are still relevant: “Universities...have not effectively recognized the strategic significance of computational science in either their organizational structures or their research and educational planning.” Computational initiatives associated with universities have taken various forms: supercomputing centers that provide national, statewide, or local computing facilities and encourage research involving computation; faculty hiring initiatives focused on initiating research programs to change the university's expertise and culture; establishment of academic research centers on campuses that include formal involvement of faculty, for example, through joint positions with departments; and multi-university or other partnerships where the university is represented by a single entity.
We believe that any academic institution wishing to advance computational and data science needs to first examine its status in three areas: cyberinfrastructure facilities, support for interdisciplinary research, and computational culture and expertise (Figure 1). Cyberinfrastructure facilities refers to the computational, storage, network, and visualization resources (local, national, and international) to which researchers have access; to the technical and professional support for these services; and to the connection of these services to desktop machines or experimental instruments in an end-to-end manner. Support for interdisciplinary research refers to the university's policies on joint appointments between units and associated promotion and tenure, policies and practices for university-wide curricula, and the academic appreciation of computational science that could rate, for example, software or data development in a similar manner to publications and citations. Finally, computational culture and expertise relates to the existence and prominence of faculty across a campus who develop or use computation as part of their research, and the provision of undergraduate and graduate courses that will train and educate students to work on research projects in the computational sciences.
Figure 1: Advancing a comprehensive computational science program requires coordinated initiatives in developing and supporting interdisciplinary research, enabling cyberinfrastructure, and underlying research and culture in computation.
Once the status of these areas has been reviewed, there are additional questions in designing a computational initiative. Should the cyberinfrastructure resources be state-of-the-art to enable leading edge research in computational science? Should faculty expertise in computational science be pervasive across all departments or concentrated in a few departments? Will the university administration back a long-term agenda in computational science and have the sustained desire to implement policies for changing culture? What is the timescale for change?
While there is some literature on issues relating to general interdisciplinary research (e.g., a National Academy review) (3), there is little written on the underlying visions, strategies, issues, practical implementations and best practices for computational initiatives. Further, what exists was usually written for a specific purpose, such as justifying an initiative for a state legislature, funding agency, or campus administration.
In April 2001, Louisiana Governor Foster asked the state Legislature to fund an Information Technology Initiative as a commitment to the 20-year Vision 2020 plan adopted in 2000 to grow and diversify the state's economy. The legislature authorized a permanent $25 million per year commitment, divided among the state's five research institutions. LSU created the Center for Applied Information Technology and Learning (LSU CAPITAL), targeting funds in education, research, and economic development, with the intent that this investment would result in the creation of new businesses, increased graduates in IT areas, and increased patents and licenses. Edward Seidel was recruited from the Max Planck Institute for Gravitational Physics (AEI) to formulate a vision and detailed plan (4) to structure LSU CAPITAL into a research center related to computation and information technology, with a physical presence on the campus and a broad mission for interdisciplinary research at LSU and across the state. Seidel became director of LSU CAPITAL, reporting to the LSU vice chancellor of research and economic development. In October 2003, LSU CAPITAL was renamed the LSU Center for Computation & Technology, or CCT (http://www.cct.lsu.edu). LSU was lacking in all three areas identified in Figure 1 (cyberinfrastructure, support for interdisciplinary research and education, and computational research), which necessitated a three-pronged approach for the center's strategy (5,6).
To address LSU's cyberinfrastructure needs, CCT worked to develop campus and regional networks, connect to the national high-speed backbone, and build sustainable computational resources on the campus. (A negative side effect of including a focus on the provision of cyberinfrastructure resources is that some people tend to label the center as just a High Performance Computing (HPC) resource provider, rather than a research center; this proved to be an issue with how the center was represented and seen by the LSU administration.) CCT led an effort to propose a statewide high-speed network (called LONI) to connect state research institutions with multiple 10-Gbps optical lambdas. Louisiana Governor Blanco then mentioned LONI as a priority in her State of the State address. At this time, National LambdaRail (NLR) was emerging as a high-speed optical national backbone without a plan to connect to Louisiana. In 2004, Governor Blanco committed $40 million over 10 years to fund LONI, including purchasing and deploying initial computational resources at five sites and supporting technicians and staff, to advance research, education, and industry in the state. The state also funded a membership in NLR to connect the state to computational power available throughout the nation and the world.
When the CCT was formed, LSU had recently deployed what were then significant computational resources: 128-node and 512-node dual-processor clusters, managed by staff from the physics department, and a 46-node IBM Power2/Power3 machine managed by the university’s Information Technology Services (ITS). LSU created the HPC@LSU group, funded 50-50 by CCT and ITS to jointly manage these systems, which were the only major compute resources in Louisiana. HPC@LSU also began to manage the LONI compute systems, IBM Power5 clusters, and later, additional Dell systems for both LONI and LSU, including Queen Bee (the largest LONI system), as part of the TeraGrid, the US national HPC infrastructure.
CCT envisioned building a campus and national center for advancing computational sciences across all disciplines, with these groups' research activities integrated as closely as possible with the research computing environment. In this way, the services provided by the computing environment to the campus and nation would be the best possible, and the research output of the faculty, students, and staff would be advanced. CCT faculty would be able to lead nationally visible research activities and carry out research programs that would not otherwise be possible, providing exemplars to the campus and catalyzing activity in computational science approaches to basic sciences, engineering, humanities, business, etc. This was a key component of the CCT vision, one that has been successful at other centers (e.g., NCSA, SDSC, AEI) around the world.
Initially, there were very few computationally oriented faculty in Louisiana, which hindered research in computational science, state collaborations, and LSU's involvement in national or international projects involving computation. To address this, CCT's core strategy has been to recruit computationally oriented faculty to LSU, generally in joint 50-50 positions with departments, with tenure residing in the departments. This model has been discussed at length and has consistently been seen as the best model for strengthening departments in computational science and encouraging real buy-in to the overall initiative from the departments. CCT also implements other strategies for associating faculty with the center, both to encourage and support faculty already on the campus in taking an active role in the center's programs and research, and to help attract and recruit faculty whose research interests overlap with CCT.
Research staff are also essential, making it possible to quickly bring in expertise in a particular computational area as a catalyst and tool for faculty recruitment, to form a bridge from center activities to the campus, to provide consistent support to strategically important areas, and to facilitate production level software development.
The fundamental group (in the CCT Core Computing Sciences Focus Area), centered around the Departments of Computer Science, Electrical and Computer Engineering, and Mathematics, was to have the necessary skills needed to build and sustain any program in computational science, including computational mathematics, scientific visualization, software toolkits, etc. Application groups were built to leverage strength on campus, hiring possibilities, and new opportunities.
In addition, CCT’s Cyberinfrastructure Development (CyD) division aimed to better integrate CCT’s research and HPC activities with the campus and national initiatives, with the mission to design, develop, and prototype cyberinfrastructure systems and software for current and future users of LSU's supercomputing systems, partnering where possible with the research groups at CCT to help professionalize prototype systems and support and expand their user base. CyD includes computational scientists, expected to cover 30-50% of their time on proposals led by scientists elsewhere at LSU or LONI, and to spend the rest of their time on computational science activities that lead to new funding or projects and internal support of HPC and LONI activities.
CCT’s education goal has been to cultivate the next generation of leaders in Louisiana’s knowledge-based economy, creating a highly skilled, diverse workforce. To reach this goal, objectives were set to assist in developing curricula and educational opportunities related to computation, to help hire faculty who would support an integrated effort to incorporate computation into the curricula, to offer programs that support activity in scientific computing, to attract and retain competitive students, and to advance opportunities for women and minorities in the STEM disciplines.
The final component of the triangle, interdisciplinary research, was supported by CCT’s organization and projects. CCT faculty are generally able to lead and take part in world-class interdisciplinary research groups related to computation, organized in focus areas: Core Computing Sciences, Coast to Cosmos, Material World, Cultural Computing, and System Science & Engineering. Each focus area has a faculty lead responsible for building cross-cutting interdisciplinary research programs, administration, coordinating the hiring of new faculty and staff, and organizing their unit. Interdisciplinary research is driven by activities in strategically motivated, large-scale projects in the focus areas, faculty research groups, and the Cyberinfrastructure Development division. These projects provide support (students, postdocs, and direction) to the Focus Areas as well as broad outreach for education and training across the state. In addition, CCT tried to engage senior administrators and use CCT faculty to drive curriculum change on the campus.
Two large projects begun in 2007 were the LONI Institute and Cybertools. The LONI Institute was a statewide multi-university collaboration, built on the success of the LONI university partnership, to coordinate the hiring of two faculty members at each university, in computer science, computational biology, and/or computational materials, and of one computational scientist at each university, to spur collaborative projects. Cybertools was another multi-university collaboration that used computational science projects across the state to drive developments in tools that could use the state’s computing and networking resources, which in turn could enable new computational science projects.
Particularly from the state legislature's point of view, CCT was intended to catalyze and support new economic development in the state. In fact, the initial metrics for success provided for LSU CAPITAL included the number of resulting new businesses and patents. Economic development needs to be carefully planned and is a long-term initiative, where success can be hard to measure, particularly in the short term. An example success, though not originally planned, was in September 2008, when Electronic Arts (EA) announced that they would place their North American quality assurance and testing center at LSU, creating 20 full-time jobs and 600 half-time jobs, with an annual payroll of $5.7 million throughout the next two years. EA noted that education and research efforts at LSU, including CCT research areas, were a strong factor in the company's decision to locate this center in Louisiana.
Recent Developments and Concluding Thoughts
In 2008, Seidel was recruited to the National Science Foundation, and LSU appointed an interim director and co-director and began a search for a new permanent director, which led to a director being appointed from inside the university for a three-year term. Starting in 2009, LSU has faced several significant and ongoing budget cuts that are currently impacting the CCT, particularly in its ability to recruit and retain faculty and staff.
The issues faced at LSU are similar to those at other institutions responding to the nation's call for an advancement of computation, computational science, and interdisciplinary research. We believe it is important to carefully analyze the experiences of centers such as the one at LSU, as we have attempted to begin to do in this paper, in order to establish best practices for new initiatives or to lead to more fundamental change. From our experiences at CCT, we can highlight four key points that we feel are crucial for the success and sustainability of computational research centers such as CCT:
- The three facets of computational science shown in Figure 1 have to be taken seriously on the campus at the highest levels and seen as an important component of academic research.
- HPC facilities on campuses need to be integrated with national resources and provide a pathway for campus research to easily connect to national and international activities.
- Education and training of students and faculty is crucial; vast improvements are needed over the small numbers currently reached through HPC center tutorials; computation and computational thinking need to be part of new curricula across all disciplines.
- Funding agencies should put more emphasis on broadening participation in computation: not just focusing on high-end systems that a decreasing number of researchers can use, but also making tools much more usable and intuitive, freeing all researchers from the limitations of their personal workstations, and providing access to simple tools for large-scale parameter studies, data archiving, visualization, and collaboration.
In addition, there are two points that we have learned specifically from the CCT experience:
- The overall vision of the university on topic X needs to be consistent across a broad spectrum of the university administration and faculty; it cannot be just one person’s vision, though it may start with one person.
- The funding needs to be stable over a number of years; activities need to be sustained to be successful, and this needs to be clear to the community from the beginning.