The Promise of Portability
Modernizing the AIHA Exposure Monitoring Data Structure
Industrial hygiene exposure monitoring data are notoriously complex. Beyond the target analyte, there are dozens of potential variables related to the monitoring process and the context of exposure. Given this complexity and what’s at stake, high-quality data are vital for making informed decisions. Yet the costs of collecting data are often high enough to require justification, and even with sufficient resources, we may find ourselves without enough statistical certainty that our exposures are acceptable.
Cost is perhaps the single biggest impediment to collecting exposure monitoring data: monitoring and evaluating exposures require time, resources, and cooperation. Even without monitoring, significant investment is needed to compile and investigate the details required to model an exposure, including the inputs for deterministic models.
Together, the cost, time, and skills required for data collection and analysis make a compelling case for sharing exposure monitoring data. Contributors, analysts, and public health entities all stand to gain substantial insight by compiling and using crowdsourced data for the prevention of occupational illness and injury. This means more workers attending birthday parties for their grandchildren, experiencing less medical burden, and living longer and happier lives.
As explained in the December 2020 Synergist article “Industrial Hygiene Data Standardization: Past Lessons, Present Challenges, and Future Directions,” barriers to sharing exposure monitoring data include the difficulty of technical implementation, logistics, and concerns about anonymization. See Table 1 for a list of barriers and possible solutions.
Table 1. Selected Barriers to Universal Sharing of Exposure Monitoring Data
Crowdsourcing has already been used to compile exposure monitoring data in a few instances. OSHA, NIOSH, and MSHA maintain databases compiled from work governed by their regulatory authority. The University of Michigan has constructed a job exposure matrix comprising nearly a million noise dosimetry measurements. France, Germany, and Canada have national exposure monitoring databases for varying purposes. Industry-specific shared databases are also used within trade organizations to memorialize exposures and implement anticipatory controls. (These databases can sometimes also be used to meet regulatory requirements: OSHA, in regulations such as its hexavalent chromium standard at 29 CFR 1910.1026(a)(4), allows the use of “objective data” to demonstrate compliance.) Simply put, many organizations have determined that the benefits of data standardization and sharing outweigh the potential costs. But expanding participation in shared exposure monitoring databases requires sharing to be easier and more beneficial to participants.

UPDATING DATA STANDARDIZATION

This summer, AIHA will publish a white paper that presents best practices for IH data standardization specific to air and noise monitoring. The paper updates the previous AIHA/ACGIH data structure proposed in 1996, whose adoption was limited in part by the large number of attributes required. The updated standardization approach described in the forthcoming white paper focuses on 16 key attributes, a 90 percent reduction from the 1996 effort, plus five to six additional attributes for specific monitoring data types. The goal of this simplification is to maintain focus on the most important data and minimize the effort necessary to create sharable records while maximizing participation from OEHS professionals by providing value through compatibility.
The updated approach also seeks to take advantage of the substantial improvement in technology and software that has occurred in the intervening three decades.
Figure 1. Records and attributes in a tabular database. A forthcoming AIHA white paper describes which attributes are essential for data portability.
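To make the tabular structure in Figure 1 concrete, here is a minimal sketch of one monitoring record in Python. The attribute names are hypothetical illustrations, not the white paper’s actual key attributes, and the export helper shows one way an organization might strip internal-only fields before anonymized sharing.

```python
from dataclasses import dataclass, asdict

@dataclass
class ExposureRecord:
    """One row in a tabular monitoring database (attribute names are illustrative)."""
    agent: str            # analyte identity, e.g., a CAS number
    sample_date: str      # ISO 8601 date of the monitoring event
    duration_min: float   # sample duration in minutes
    result: float         # measured concentration or level
    units: str            # e.g., "mg/m3" or "dBA"
    fraction: str         # e.g., "inhalable", "respirable", "thoracic", "total"
    method: str           # sampling/analytical method reference
    department: str = ""  # internal-only attribute; remove before sharing
    building: str = ""    # internal-only attribute; remove before sharing

# Attributes useful for internal processes but inappropriate for export
INTERNAL_ONLY = {"department", "building"}

def to_shareable(record: ExposureRecord) -> dict:
    """Return an anonymized dict containing only shareable attributes."""
    return {k: v for k, v in asdict(record).items() if k not in INTERNAL_ONLY}
```

The separation between shareable and internal-only attributes is an organizational decision; the two fields singled out here simply echo the examples discussed in this article.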
Selecting these essential attributes required the authors of the white paper to engage with many experienced professionals. Ultimately, the decision to include an attribute for the purpose of portability or shareability was informed by three key factors:

1. Essentiality: the number of attributes must be as small as possible to limit complexity and support adoption while still supporting a complete exposure assessment methodology.

2. Integrity: key attributes related to reliability or interpretability must be included to prevent systematic biases and to ensure essential context is not lost.

3. Compatibility: attributes should be as universal among agents as possible to account for diverse use cases while still ensuring specific records are intelligible and transmissible.

The team of authors curated the list by compiling all possible attributes and determining which were indispensable. Attributes considered necessary included units of measurement and fractions for agent concentrations (for example, inhalable, respirable, total, or thoracic), which are not standardized or specified in historical databases, leading to loss of integrity. Some attributes deemed nonessential were values that can be calculated, such as projected time-weighted average (TWA), which can be determined from other noise parameters.

Striking a balance between comprehensiveness and utility was the most difficult part of this process. The longer the list of necessary data attributes, the more labor is required to collect data in the field, potentially limiting adoption. While many IH monitoring programs use complex or customized software, many organizations still rely on paper forms or tabular formats such as Excel spreadsheets. For optimal impact, this standardization effort must serve the current needs of individual users while accounting for future needs, technological advancements, and the greater good.
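As an illustration of why projected TWA can be treated as derivable rather than stored, the sketch below computes noise dose and projected eight-hour TWA from measured levels and durations using OSHA’s parameters (90 dBA criterion level, 5 dB exchange rate, per 29 CFR 1910.95 Appendix A). Function names are illustrative, and thresholding rules (for example, excluding levels below 80 dBA) are omitted for brevity.

```python
import math

def osha_dose_percent(segments, criterion=90.0, exchange_rate=5.0, ref_hours=8.0):
    """Noise dose (%) from (level_dBA, duration_hours) pairs, following the
    29 CFR 1910.95 Appendix A method. Threshold handling is omitted here."""
    dose = 0.0
    for level, hours in segments:
        # Permissible duration at this level: T = 8 / 2**((L - 90) / 5)
        allowed_hours = ref_hours / 2 ** ((level - criterion) / exchange_rate)
        dose += hours / allowed_hours
    return 100.0 * dose

def projected_twa(dose_percent):
    """Eight-hour TWA (dBA) equivalent to a dose: TWA = 16.61*log10(D/100) + 90."""
    return 16.61 * math.log10(dose_percent / 100.0) + 90.0
```

For instance, eight hours at 95 dBA yields a 200 percent dose and a projected TWA of 95 dBA, which is why storing the measured levels and durations is sufficient to recover the projection.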
Although the white paper defines a limited number of essential attributes, organizations can and often should include additional attributes that are useful for internal processes but inappropriate for sharing. Examples include department or building identifiers, which should be removed for anonymity prior to export.

THE VALUE OF PORTABLE STANDARDIZED DATA

The ability to share data requires adoption of a mutually intelligible structure. But to encourage widespread sharing, the structure must provide immediate value for the organizations that adopt it. Implementing the proposed structure for prospective data collection, as well as retrospective data recapture or translation, may have some of the following benefits:

Improving data quality and reliability. Using the structure for exposure monitoring efforts, or to specify data collected during contracted professional services, minimizes data gaps that could hamper future analysis. Emerging data evaluation needs can be met quickly, and data transformations, what-if analyses (such as for shift length changes), and statistically informed action levels can be readily reviewed. To guide software development, organizations can also request a standardized structure and supporting features from providers of data systems.

Hardening institutional memory. Considering the limited number of highly qualified industrial hygiene practitioners, trends in the OEHS job market, and the competitiveness of service providers, maintaining the compatibility and accessibility of data from multiple sources is key to meeting stakeholder commitments. Differences in data structure between facilities, or deficits in data recording, can fracture data segments and impede an enterprise risk management approach to continuous quality improvement through exposure reduction.

Justifying best practices. Critical changes to exposure control can require adoption of new systems, processes, or equipment.
These acquisitions will need to be justified, especially if they increase costs. Standardized data allow for simpler comparison of exposures before and after implementing controls, among sites with different control statuses or strategies, or when planning controls for processes in development.

Assisting claims management. With changing societal expectations regarding workplace responsibility for occupational exposures, exposure-related claims management requires careful stewardship of data that can potentially be used in defense of organizations. Conversely, standardization also allows for simpler predictive analysis of potential claims liability.

Building datasets to train machine learning algorithms. With the advent of machine learning (ML) and artificial intelligence (AI), identifying properly curated datasets for training is crucial to advanced tooling. Data standardization allows organizations to develop predictive and analytical models that harness the most advanced ML and AI products. The value of qualifying these datasets (that is, ensuring their validity) can easily exceed the costs of changing data formats.

Data standardization can also help individual organizations improve their exposure monitoring strategy. The benefits of standardization increase when data portability is used to maximum advantage. A portable data structure may yield the following key benefits:

Simplifying benchmarking. Sharing and comparing anonymized data in peer-to-peer relationships or in an industry-specific context allows for benchmarking between similar organizations. Benchmarking can be used to justify expenditures for overcoming infrastructure debt, implementing new processes or controls, and investing in exposure control projects. With more organizations taking a nonproprietary approach to worker protection, quantitative comparison can show workers and leaders that core commitments are being met.
Organizations can also evaluate their relative positioning within an industry through comparisons of qualitative, blinded, or anonymized data.

Enabling predictive exposure analytics (PEA). With the constant advance of data-driven industry and automation, applying data standardization to real-time detection systems (RTDS) allows generation of a training dataset for exposure monitoring. When combined with process information such as production rate, feedstock characteristics, or ventilation variables, a properly qualified AI can provide insight into exposure trends and anticipate the need for additional proactive exposure controls.

Empowering research and contextualizing feasibility assessment. Researchers already use public exposure monitoring databases to inform analysis and direct research efforts. Anonymously sharing exposure monitoring data collected under a standardized data structure expands an organization’s perspective and hones the development of controls and feasibility assessments. Producing exposure monitoring data that can be easily processed and integrated allows organizations or trade groups to better represent exposures they would not otherwise consider.

Improving generative AI models. Expanding these models to include crowdsourced exposure monitoring data can enhance the usability of AI and broaden its applications. As presenter Dave Risi observed at AIHce EXP 2023 in Phoenix, OEHS professionals have only begun to realize the usefulness of AI combined with advanced applications for natural language processing.

INCLUDING REAL-TIME DETECTION SYSTEMS

In the future, the use of RTDS for industrial hygiene will increase for many hazards, including air and noise. In terms of the “5 Vs” of Big Data, greater RTDS use will lead to increased data volume, velocity of data generation, variety, and value, as well as a greater focus on the veracity (that is, quality) of the data.
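One routine transformation of RTDS output is collapsing a continuous data log into a discrete time-weighted average. Below is a minimal sketch, assuming the log has already been reduced to (concentration, duration) segments; the function name and the 480-minute reference shift are illustrative assumptions, and the convention of treating unsampled shift time as zero exposure is one of several an organization might adopt.

```python
def twa_from_log(segments, shift_minutes=480.0):
    """Time-weighted average concentration from (concentration, duration_minutes)
    segments of a data log. Unsampled time in the reference shift is treated
    as zero exposure, one common (but not universal) convention."""
    sampled = sum(t for _, t in segments)
    if sampled > shift_minutes:
        raise ValueError("logged time exceeds the reference shift")
    return sum(c * t for c, t in segments) / shift_minutes
```

For example, 240 minutes at 0.2 mg/m3 followed by 240 unexposed minutes yields a full-shift TWA of 0.1 mg/m3. How to bound the averaging period, especially for toxicants whose effects extend beyond a single shift, is exactly the kind of decision discussed below.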
As the Internet of Things (IoT) revolution continues to tie RTDS to cloud databases, capturing data streams and transforming them into reliable exposure monitoring conclusions will become more important. Data portability achieved through standardization simplifies the integration of diverse sensors into a unified strategy of capture, analysis, and recordkeeping. In other industries, data transfer standards like ISA100, a wireless network technology standard developed by the International Society of Automation, have revolutionized the interoperability of instrumentation and data systems. A portable OEHS data structure will allow manufacturers, programmers, and integrators of RTDS to collaborate in pursuit of intelligibility between instruments and systems.

Techniques for transforming continuous monitoring and direct-reading instrument data logs into discrete exposure monitoring conclusions are under discussion. Including RTDS in databases may require organizations to take positions on how to identify exposure monitoring aliquots, especially when these determinations extend beyond clearly identified shifts, as may be required for toxicants with chronic or body-burden health effects.

IMPLEMENTING STANDARDIZATION

To realize the power of portable data, the first step is adopting the standardized data structure as an aspirational goal and determining the data gaps between current and future practice. The data structure proposed in the AIHA white paper specifies the key attributes, but additional attributes may be required to meet stakeholders’ needs. Closing these gaps will help identify which data are necessary for general practice and which serve other needs. Adopting a standardized data structure will require organizations to determine which additional attributes may be needed for internal use.
For example, external users aren’t likely to care that specific exposure monitoring records are from a company’s production operations department, but this information could be useful to internal stakeholders. Determining these additional attributes is an organizational decision. Once a structure is adopted, the next step is to determine the cyclical workflow that will allow collection and specification of the essential attributes and any additional attributes to be implemented. Traditional exposure assessments may be simple to integrate into the new process, but other monitoring strategies may need to be adapted to take full advantage of the new structure. Practitioners may wish to focus on the immediate and near-term benefits for their own organizations, with added value coming from participation in a larger community of practice. Ultimately, implementation of data standardization can vastly improve data quality and usability within organizations and move the OEHS profession closer to data shareability across the IH field.

SPENCER PIZZANI, CIH, is the occupational health manager for PepsiCo Global EHS.

TAYLOR M. SHOCKEY, PhD, MPH, is a research health scientist at CDC/NIOSH in Cincinnati, Ohio.
AIHA: “How AI, ML, and Big Data can Apply to the Profession,” AIHce EXP 2023 (presentation by Dave Risi, May 2023).
AIHA: “Making Accurate Exposure Risk Decisions” (2023).
Applied Occupational and Environmental Hygiene: “Special Report: Data Elements for Occupational Exposure Databases: Guidelines and Recommendations for Airborne Hazards and Noise” (November 1996).
The Synergist: “Industrial Hygiene Data Standardization: Past Lessons, Present Challenges, and Future Directions” (December 2020).