Network Analysis of Data Architectures Case Study

By Les Morgan

IMPACT, which stands for Improving Massachusetts Post-Acute Care Transfers, is a project funded by a grant from the Office of the National Coordinator. It is designed to improve care transitions using an enhanced electronic Universal Transfer Form (UTF) and electronic health information exchange. [For details see: http://mehi.masstech.org/what-we-do/hie/impact ]

IMPACT will focus its efforts in Worcester County, a region where 85% of the healthcare for its population of 800,000 stays within the county. IMPACT will be able to analyze almost 100,000 patient transfers per year, as well as 20,000 Medicare Advantage patients whose claims data will be used for total cost of care analyses. A pre- and post-test model using claims and other metrics will be used to evaluate the success of the project.

A feature of the IMPACT project is that at least some of the technology it is developing will be available as open source code. The project has developed a data model describing the record structures needed to coordinate health care for patients across multiple providers.

My special areas of interest are frail elders and people with complex and chronic medical conditions that often lead to death. To assess whether the IMPACT data structures would be well suited to carrying the type of data needed to improve care for those particularly vulnerable patients, I used network analysis methods for visualization.

I chose Cytoscape to encode and visualize the IMPACT data structures as graphs. Cytoscape is an open source software platform for visualizing molecular interaction networks and biological pathways and for integrating these networks with annotations, gene expression profiles, and other state data. Although Cytoscape was originally designed for biological research, it is now a general platform for complex network analysis and visualization. In my case, I wanted to use Cytoscape for structural systems analysis of a software application.
Thanks to the courtesy of the IMPACT project team, I was supplied with an Excel workbook documenting their data model as a set of rectangular “sheets”. An Excel data sheet is a two-dimensional rectangular matrix, and several sheets were used to document the record formats.

Using manual methods, I transformed the rectangular matrices into network form, creating a Node Table and an Edge Table, the standard inputs for graph analysis. Nodes represent data elements. Edges represent hierarchical relationships between data elements within the electronic record format. There are 631 nodes and 633 edges in the resulting graph.

After converting the data elements to network form, I coded each data element based on my judgement of how important it would be in care coordination for frail elders and very sick people. I used color coding to represent different types of information in the record, with red representing a particularly important element for my purposes. I also coded node shapes, using a triangle for nodes that were of particular importance. The size of each node was adjusted to give visual emphasis to key items. Once these visual cues were added, the portions of the data structures that would be critical to care transitions became easy to see.

[Figure: IMPACT data elements]

Cytoscape has many types of data visualization that can be applied to networks. One of the simplest ways to get an overview of the entire graph is to use a force-directed clustering algorithm with a bias for circular arrangements. The first figure shows a visualization of that type. This type of clustering visualization preserves the hierarchical structure of the graph but relies on color, shape, and size to draw attention to critical elements within clusters.
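I did the conversion from sheets to a Node Table and an Edge Table by hand, but the same step could be scripted. The following is only a minimal sketch under stated assumptions: the rows, element names, and importance coding are hypothetical stand-ins, not the actual IMPACT workbook contents, and the sketch assumes each sheet row lists the path from the record root down to one data element.

```python
import csv
import io

# Hypothetical sheet rows (NOT the real IMPACT data model): each row lists
# the path from the record root down to one data element.
SHEET_ROWS = [
    ["TransferForm", "Patient", "Name"],
    ["TransferForm", "Patient", "AdvanceDirectives"],
    ["TransferForm", "Assessment", "ActivitiesOfDailyLiving"],
    ["TransferForm", "CarePlan"],
]

# Hypothetical analyst coding: elements judged especially important.
IMPORTANT = {"AdvanceDirectives", "ActivitiesOfDailyLiving", "CarePlan"}


def build_tables(rows, important):
    """Turn root-to-leaf paths into a node table and an edge table."""
    nodes = {}          # node id -> attribute dict (insertion ordered)
    edges = []          # list of edge dicts
    seen_edges = set()  # (source, target) pairs already recorded
    for path in rows:
        for depth, name in enumerate(path):
            # The full path is the node id, so equal names in different
            # branches of the record stay distinct nodes.
            node_id = "/".join(path[: depth + 1])
            if node_id not in nodes:
                nodes[node_id] = {
                    "id": node_id,
                    "label": name,
                    "important": name in important,
                }
            if depth > 0:
                parent_id = "/".join(path[:depth])
                if (parent_id, node_id) not in seen_edges:
                    seen_edges.add((parent_id, node_id))
                    edges.append({"source": parent_id, "target": node_id})
    return list(nodes.values()), edges


def to_csv(fieldnames, records):
    """Serialize a list of dicts as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


NODES, EDGES = build_tables(SHEET_ROWS, IMPORTANT)
NODE_CSV = to_csv(["id", "label", "important"], NODES)
EDGE_CSV = to_csv(["source", "target"], EDGES)
print(EDGE_CSV)
```

The two CSV strings could be saved to files and loaded through Cytoscape's table import. The `important` column is then available as a node attribute for mapping to color, shape, or size in a visual style.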
The complex interconnections between elements located in different sections of the record structure create a spiderweb of linkages, as shown in the second visualization. These are the same data elements, but arranged in one large circle. Items that are similar to one another based on the analyst’s coding are placed close together on the perimeter of the circle. Data elements that are in the same hierarchical sector of the record, but not similar in importance as coded by the analyst, are farther apart, linked by a line that crosses the circle and represents a network edge.

[Figure: IMPACT Data Elements: Degree Sorted Circle Layout]

Many different algorithms can be applied to the data for rapid visual exploration of the record structures. If one visualization does not make much sense, another can be tested easily. Subsets of data elements matching any particular criteria can be isolated for detailed viewing.

Cytoscape permits export of network visualizations in multiple formats. The two images shown here are in low-resolution .PNG format for general illustration purposes only. More detailed, zoomable visualizations can be created using .SVG format.

My takeaways from this exercise

1. The IMPACT data structures contain most of the basic information needed to provide care coordination services for frail elderly clients and persons with very serious illnesses that are likely to lead to eventual death. Data elements of particular importance in such cases include assessment of activities of daily living, care plans, and advance directives.

2. Cytoscape is an effective tool for information systems analysis. The methods used here for data structure analysis could be applied to other aspects of systems design, including documentation of code structures.

How do you use network analysis in healthcare? Do you have a pet project using graph analysis for healthcare data? Leave a comment and tell us about it!