ATA Releases White Paper on Reliability of CSA Scores

12/9/2013

Compliance, Safety, Accountability (CSA) is FMCSA’s safety monitoring and measurement system used to identify unsafe carriers and prioritize them for future interventions (e.g., audits). The agency also encourages third parties to use CSA Safety Measurement System (SMS) scores as a tool for making safety-based business decisions.1 FMCSA hopes to leverage the power of the marketplace to make judgments about carriers and, as a result, compel them to improve their safety performance. SMS scores also have the potential to be used by plaintiffs’ attorneys and prosecutors in the context of post-crash litigation.

The use of SMS scores by third-party stakeholders and their evaluation by judges raise obvious questions about the accuracy and reliability of the data. For stakeholders such as shippers and brokers, the question is whether the scores can be routinely relied upon to make sound, beneficial judgments about the safety posture of individual carriers. Similarly, courts must be concerned with whether SMS data meet Federal and jurisdictional rules of evidence, which require that the data be “trustworthy”2 and rest “on a reliable foundation.”3

Researchers have arrived at mixed conclusions with respect to the reliability of SMS scores in identifying unsafe (crash-prone) motor carriers. Some found virtually no correlation between scores and crash rates in any of the measurement categories.4 However, the American Transportation Research Institute (ATRI), using a better prediction model, found a positive relationship between scores and crash risk in three of the publicly available measurement categories (BASICs) but also found that scores in two others bear an inverse relationship to crash risk.5 Of the categories that are not publicly available, scores in one (the Crash Indicator BASIC) likely correspond well to future crash involvement,6 but scores in the other (the HM Compliance BASIC) do not. ATRI also pointed out that the number of alerts that a carrier has been assigned is a strong indicator of crash risk.7 However, the strength of the relationship varies depending on the BASICs in which the carrier has alerts, since scores in some BASICs correlate more strongly with crash risk than those in others.

The relationship between scores and crash risk is affected by a number of data and methodology problems that plague the system. These include: a substantial lack of data, particularly on small carriers, which make up the bulk of the industry; regional enforcement disparities; the questionable assignment of severity weights to individual violations; the underreporting of crashes by states; the inclusion of crashes that were not caused by motor carriers; and the increased exposure to crashes experienced by carriers operating in urban environments.

1 Carrier Safety Measurement System Methodology, Version 3.0, Revised December 2012, FMCSA, page 1-2.
2 Federal Rules of Evidence, Rule 803(8)(B).
3 Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993), United States Supreme Court.
4 Gallo, A. P. & Busche, M., CSA: Another Look with Similar Conclusions, Wells Fargo Securities Equity Research, July 12, 2012; Gimpel, J. Statistical Issues in the Safety Measurement and Inspection of Motor Carriers. Alliance for Safe, Efficient and Competitive Truck Transportation, July 10, 2012.
5 American Transportation Research Institute (ATRI), Compliance Safety Accountability: Analyzing the Relationship of Scores to Crash Risk, October 2012, page vii.
6 Note that crash involvement does not imply cause.
7 ATRI, page 30.
Though there are statistical correlations between SMS scores in certain categories and crash risk, as well as between the total number of alerts assigned and crash risk, individual carriers’ scores can be unreliable indicators of their safety performance. The identified correlations between scores and crash risk represent industry-wide trends that often do not hold true for individual carriers. In most BASICs there are thousands of carriers (“exceptions”) whose scores contradict the trends (i.e., carriers with high scores but low crash rates and vice versa). The sheer number of “exceptions” and the presence of numerous data and methodology problems lead to the conclusion that SMS scores alone are, at a minimum, unreliable as measures of individual carrier safety performance.
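
The distinction between an industry-wide trend and an individual carrier's result can be illustrated with a small simulation. The sketch below is not drawn from SMS data or from any of the cited studies. It generates purely synthetic carriers under assumed parameters (a hypothetical 0-100 score, a crash rate that rises only weakly with that score, and large carrier-to-carrier noise), then counts how many synthetic carriers fall on the "wrong" side of the trend. The function names, the slope, the noise level, and the cutoffs are all assumptions chosen for illustration.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def simulate(n=10_000, seed=0):
    """Return synthetic (scores, crash_rates); every parameter is an assumption."""
    rng = random.Random(seed)
    scores = [rng.uniform(0, 100) for _ in range(n)]
    # Crash rate rises weakly with score; most variation is carrier-level noise.
    rates = [max(0.0, 1.0 + 0.01 * s + rng.gauss(0, 1.0)) for s in scores]
    return scores, rates


def count_exceptions(scores, rates, score_cut=50.0):
    """Count carriers whose score and crash rate disagree with the trend."""
    rate_cut = statistics.median(rates)
    high_score_low_rate = sum(
        1 for s, r in zip(scores, rates) if s >= score_cut and r < rate_cut
    )
    low_score_high_rate = sum(
        1 for s, r in zip(scores, rates) if s < score_cut and r >= rate_cut
    )
    return high_score_low_rate, low_score_high_rate


scores, rates = simulate()
r = pearson(scores, rates)
hi_lo, lo_hi = count_exceptions(scores, rates)
print(f"correlation between score and crash rate: {r:.2f}")
print(f"high score, below-median crash rate: {hi_lo}")
print(f"low score, at-or-above-median crash rate: {lo_hi}")
```

Under these assumed parameters the population-level correlation is positive, yet a large share of the synthetic carriers are "exceptions" in the sense described above. The simulation says nothing about the actual size of either effect in SMS data; it only shows that the two observations are statistically compatible.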