Sources, Metrics & Methodology
How Pythia gets to the figures you use
Carrier Benchmarking
Data Sources
Our database integrates multiple sources to provide a comprehensive view of carrier performance:
- NAIC Data: Carriers’ statutory filings
- PIPSO Data: State Exchanges, FAIR funds, Beach and Wind Plans
- Annual Statements: Workers’ Compensation State Funds (e.g., NYSIF)
Data Aggregation
To allow our users to access data at different levels of granularity, we aggregate data for products and geographies:
- Product-Level Aggregation: For example, figures for the line of business Marine/Aircraft Liability Total are the aggregation of statutory data for Inland Marine + Ocean Marine + Aircraft (NAIC lines)
- Geographic Aggregation: For example, figures for the Plains Region are the aggregation of statutory data for the States of IA, KS, MN, MO, NE, ND, SD
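The two roll-ups above can be sketched as a simple filtered sum. This is a minimal illustration, not Pythia's implementation: the mapping names `PRODUCT_ROLLUP` and `REGION_ROLLUP`, the `aggregate` function, the row layout and the premium values are all hypothetical. Only the member lines and states are taken from the examples in the text.

```python
# Hypothetical roll-up maps; member lists come from the examples in the text.
PRODUCT_ROLLUP = {
    "Marine/Aircraft Liability Total": ["Inland Marine", "Ocean Marine", "Aircraft"],
}
REGION_ROLLUP = {
    "Plains": ["IA", "KS", "MN", "MO", "NE", "ND", "SD"],
}


def aggregate(rows, product_group, region, field="premium"):
    """Sum `field` over rows whose line and state belong to the requested groups."""
    lines = set(PRODUCT_ROLLUP[product_group])
    states = set(REGION_ROLLUP[region])
    return sum(r[field] for r in rows if r["line"] in lines and r["state"] in states)


# Illustrative rows only, not real statutory data.
rows = [
    {"state": "IA", "line": "Inland Marine", "premium": 100.0},
    {"state": "KS", "line": "Ocean Marine", "premium": 50.0},
    {"state": "NE", "line": "Aircraft", "premium": 25.0},
    {"state": "TX", "line": "Aircraft", "premium": 40.0},    # outside the Plains Region
    {"state": "IA", "line": "Homeowners", "premium": 70.0},  # outside the product group
]

total = aggregate(rows, "Marine/Aircraft Liability Total", "Plains")
print(total)
```

Only the first three rows contribute, because the other two fall outside either the region or the product group.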
Figure Allocations
- Full P&L (Combined Ratio) at State and Product Level: Pythia’s proprietary analysis allows our users to access Combined Ratio for carriers at the State and Product level. To do so, we’ve allocated figures reported at the national level (e.g., ULAE) down to the State level using a proprietary methodology.
- Fixed and Contingent Commissions: Statutory filings report Fixed Commissions and Contingent Commissions separately only at the National level. We split the total commissions reported at the State and Product level into these two components using a proprietary methodology.
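To make the idea of pushing a national figure down to States concrete, here is a generic pro-rata sketch. It is not Pythia's proprietary methodology: the choice of State losses as the allocation weight, the simplified combined ratio (all costs divided by earned premium), the function names and every number are assumptions made purely for illustration.

```python
def allocate_pro_rata(national_total, weights):
    """Split a national figure across states in proportion to the given weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return {state: national_total * w / total_weight for state, w in weights.items()}


def combined_ratio(losses, lae, expenses, earned_premium):
    """Simplified combined ratio: all costs over earned premium (an assumption)."""
    return (losses + lae + expenses) / earned_premium


# Illustrative numbers only, not real carrier data.
state_losses = {"IA": 60.0, "KS": 40.0}
state_expenses = {"IA": 30.0, "KS": 25.0}
state_premium = {"IA": 100.0, "KS": 80.0}
national_ulae = 10.0

# Assumed weight: each state's share of incurred losses.
state_ulae = allocate_pro_rata(national_ulae, state_losses)

ratios = {
    s: combined_ratio(state_losses[s], state_ulae[s], state_expenses[s], state_premium[s])
    for s in state_losses
}
print(state_ulae, ratios)
```

Whatever weights are used, a pro-rata split keeps the State-level amounts summing back to the national figure, which is the property any such allocation needs to preserve.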
Carrier Classification and Industry Segmentation
- Channel and Customer Size Classification: Pythia’s team has dedicated several months of research across SERFF filings, statutory filings, press reports, earnings calls and other sources to classify the 4,000+ carriers into channels and customer segments.
- Example: Berkshire’s “GEICO Texas County Mutual” is classified as “Direct channel” and its Commercial Auto book as “Small Commercial” (accounts under $15M revenue).
- Updates: This is an ongoing process which allows us to reflect changes in carrier strategy and distribution in our classification (e.g., most recently Farmers Insurance switched several entities from Tied to Independent Agency).
Industry Spotlight
Data Sources
To provide a concrete and reliable perspective on premiums at the intersection of geography, industry, size segment and Admitted/E&S status, we leverage 10-15 critical data sources, including:
- NAIC: Premiums by product and state (Admitted vs. E&S)
- BLS (Bureau of Labor Statistics): Number of businesses, FTEs, wages
- BEA (Bureau of Economic Analysis): GDP, Gross Output, revenue metrics
- Census: Housing units, population
- FAA: Aircraft count
- State DMVs: Personal and commercial vehicle data
- FEMA: Housing unit values, risk ratings at the ZIP code level
Premium Estimation Model
We leverage proprietary models to estimate premiums at the industry-city-size segment level:
- Example: Workers’ Compensation Premiums
- Considerations:
- # of businesses and FTEs by industry
- Business size segmentation (larger businesses have lower Workers’ Comp spend per FTE)
- Fatal occupational injuries by industry (riskier industries → higher premiums)
- FEMA risk scores (higher-risk areas → higher premiums)
- Example: Aircraft Insurance Premiums
- Considerations:
- # of aircraft on the FAA aircraft registry by ZIP code
- Industry (NAICS) classification of LLC owners of aircraft
- Frequency of usage and risk level for certain industries (e.g., agriculture)
- Engine size
Insurance Rates
- Our rates data leverage 2 key sources:
- SERFF filings for 50+ carriers
- CIAB quarterly rate reporting, analyzed with a proprietary methodology
Catastrophe Risk Data
- Pythia’s proprietary methodology allows our users to access clean data for (1) Incurred cat losses at the State level and (2) Insured part of incurred cat losses at the State level
- Base data: NOAA’s data on incurred catastrophe losses for each catastrophe. Such data is not allocated to States, leading to double-counting when viewed State-by-State
- State-Level Allocation: We merge NOAA data with NAIC data and months-long Pythia research to estimate the incurred part of losses at the State level and to build a concrete perspective on the insured part of incurred losses
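The double-counting problem and the effect of a State-level allocation can be illustrated with a small sketch. It does not reproduce Pythia's method: the event names, loss amounts and per-State weights are invented, and the proportional split is only one assumed way of allocating an event's loss.

```python
def naive_state_totals(events):
    """Attribute each event's full loss to every affected state (double-counts)."""
    totals = {}
    for e in events:
        for state in e["weights"]:
            totals[state] = totals.get(state, 0.0) + e["loss"]
    return totals


def allocated_state_totals(events):
    """Split each event's loss across its states in proportion to the weights."""
    totals = {}
    for e in events:
        weight_sum = sum(e["weights"].values())
        for state, w in e["weights"].items():
            totals[state] = totals.get(state, 0.0) + e["loss"] * w / weight_sum
    return totals


# Hypothetical events; weights stand in for whatever exposure measure is used.
events = [
    {"name": "Event A", "loss": 100.0, "weights": {"TX": 3.0, "LA": 1.0}},
    {"name": "Event B", "loss": 40.0, "weights": {"LA": 1.0}},
]

event_total = sum(e["loss"] for e in events)
naive = naive_state_totals(events)
allocated = allocated_state_totals(events)
print(event_total, sum(naive.values()), sum(allocated.values()))
```

In this example the naive State view adds up to more than the events actually cost, while the allocated view sums back to the event total.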
Premium and Rate Forecasting
Pythia’s premium and rate forecasts are built on NAIC, SERFF and CIAB data, combined with 85+ indicators from BLS, BEA, Census, NOAA and other sources through a proprietary ML-driven methodology.