This report contains analysis (and related commentary) of data collected on the influenza virus, commonly referred to as the flu, in San Diego County. California state law requires reporting of influenza outbreaks, including deaths from influenza of individuals between the ages of 0 and 64 years. The analysis aims to answer the questions listed under Data Questions below; results can be found under Insights.
All of the data used in this analysis is sourced from the open data portal of SD HHSA (San Diego County Health & Human Services Agency) and can be downloaded from this link.
The code file can be found here. Note that it covers all datasets from the SD HHSA and is not specific to the flu dataset.
These questions are intended to help us understand the impact of the flu over time, across age groups and across communities.
Below is the full set of transformations carried out on the dataset, along with the exploratory visualizations used to answer the data questions.
import sys
import os
import pandas as pd
import numpy as np
import pprint
from itertools import product, groupby
# plotting and display related
import matplotlib.pyplot as plt
from matplotlib import pylab, mlab, gridspec, cm
from IPython.core.pylabtools import figsize, getfigs
from IPython.display import display, HTML
from pylab import *
# custom utilities
import utils.pyutils as pyt
%matplotlib inline
CWD = os.getcwd()
DATADIR = 'data'
DATAFILE = "Flu_Public_Health_Data.csv"
We first inspect the fields in the data to decide which ones we'd need for the analysis.
df = pd.read_csv(os.path.join(DATADIR,DATAFILE),skipinitialspace=True)
%pprint
df.columns.values.tolist()
%pprint
A look at the fields shows that the data contains breakdowns of the statistics by ethnicity and gender. Our analysis is not concerned with these breakdowns, so the related fields can be dropped.
df = df.iloc[:,:23]
df.head(n=3)
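For reference, and as a quick check on the positional slice above, the columns it removed can be listed by re-reading just the CSV header and comparing it against the trimmed frame (a small sketch; it assumes the file is still at the same path):
# re-read only the header row and list the breakdown columns dropped by the slice above
all_cols = pd.read_csv(os.path.join(DATADIR, DATAFILE), nrows=0).columns
dropped_cols = [c for c in all_cols if c not in df.columns]
dropped_cols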
Let's examine the contents of a few other key fields to get a better sense of what this data contains.
display(HTML("<b>Years</b>: {}".format(list(df.Year.unique()))))
display(HTML("<b>Outcome</b>: {}".format(list(df.OUTCOME.unique()))))
display(HTML("<b>Region</b>: {}".format(list(df.Region.unique()))))
display(HTML("<b>SES</b>: {}".format(list(df.SES.unique()))))
The data contains records for only the last five years. That is too short a window for long-term trends, but the records are contiguous and can still reveal short-term ones. More importantly, data is available not only for all the major areas in the county but also for both fatal and non-fatal outcomes of the outbreaks.
That said, the share of the population afflicted by the flu is likely much larger than the aggregate of the outcomes recorded here, since most people either self-medicate or are treated at non-emergency facilities (including their local physician). One should, however, expect trends in the larger population to be reflected even in this smaller subset comprising deaths, hospitalizations and Emergency Department (ED) discharges.
display(HTML("<b>Geo Type</b>: {}".format(list(df.GeoType.unique()))))
From the GeoType values listed above one can see that the data is recorded for several different geographies in any given year. From the code book we learn that there are 41 Sub-Regional Areas (SRAs), 18 city municipalities, 6 HHSA Service Regions and 5 Supervisory districts. The numbers are reported separately for each of these geographies, not least because one geography cannot easily be expressed as a sum of another; an SRA, for example, can cut across both region and municipal boundaries.
Essentially, the data must be filtered by geography type before it can be analyzed further. For this analysis we opt for data recorded on a city municipality basis. It should be noted, though, that the SRA is the finest-grained geography in the data; it is usually preferable to retain information at the finest granularity and aggregate it up into larger geographies. SRAs also carry additional information in the form of SES (Socio-Economic Sector), which is lost when using larger geographies such as municipalities.
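Before filtering, a quick sanity check against the code book counts (41 SRAs, 18 municipalities, 6 regions, 5 districts) is to count the distinct geography names recorded under each GeoType:
# distinct geography names per geography type, as a sanity check against the code book
df.groupby('GeoType')['GeoName'].nunique()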
df = df[df.GeoType == "Municipal"]
df.shape
pyt.summarize(df)
From the summary of the remaining data we see that a number of fields contain NaNs (represented by the null count in the summary). We purge fields (i.e. columns) whose values are NULL for every row, while retaining rows whose key indicators are all NULL: such rows are placeholders for regions that reported no data for the specified outcome.
# drop cols with NAs for ALL rows
df = df.dropna(how='all',axis=1)
df.shape
# drop rows with NAs for ALL of the key indicators
#df = df.dropna(subset=['Total','Age0_14','Age15_24','Age25_44','Age45_64','Age65Plus'],how='all')
#df = df.fillna(0)
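It is also worth quantifying how many of the retained rows are pure placeholders, i.e. rows where every key indicator is NULL (a quick check using the same indicator columns referenced above):
# count placeholder rows: every key indicator column is NULL for these rows
indicator_cols = ['Total','Age0_14','Age15_24','Age25_44','Age45_64','Age65Plus']
df[indicator_cols].isnull().all(axis=1).sum()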
Note that in the table below we aggregate numbers by year and outcome. On close inspection one can see that the Total does not equal the sum of the age-group columns. This is likely because counts of less than 5 are suppressed at the age-group level even though they are included in the total, a privacy-driven compliance requirement for healthcare open data.
pd.options.display.float_format = '{:,.0f}'.format
# group by Year
df_yr = df.groupby(['Year','OUTCOME'])
cols = ['Total','Age0_14','Age15_24','Age25_44','Age45_64','Age65Plus']
df_yr_totals = df_yr[cols].sum().fillna(0)
df_yr_totals
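The size of the gap attributable to the suppressed small counts can be made explicit by subtracting the sum of the age-group columns from the reported Total in the aggregated table above:
# gap between the reported Total and the sum of the age-group columns;
# a positive gap reflects counts suppressed (less than 5) at the age-group level
age_cols = ['Age0_14','Age15_24','Age25_44','Age45_64','Age65Plus']
df_yr_totals['Total'] - df_yr_totals[age_cols].sum(axis=1)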
#plt.style.use = 'default'
matplotlib.rcParams['font.size'] = 10
matplotlib.rcParams['font.weight'] = 'bold'
matplotlib.rcParams['font.family'] = 'sans-serif'
matplotlib.rcParams['font.sans-serif'] = ['Tahoma']
# just plot the totals for each group
df_plot = df_yr_totals.unstack()['Total']
plt.plot(df_plot,marker='o')
plt.legend(df_plot.columns, prop={'size': 12, 'weight': 'regular'})
fig = plt.gcf()
fig.set_size_inches(12, 6)
ax = plt.gca()
ax.grid(False)
ax.set_facecolor('white')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.xaxis.set_major_locator(MaxNLocator(integer=True))
title = "San Diego Cities: Flu Trends (2011-2015), Aggregates"
ax.set_title(title,fontsize=14, fontweight='bold',color='black')
ax.title.set_position([.5,1.05])
plt.show()
pd.options.display.float_format = '{:,.2f}'.format
def hl_max_outcome(s):
'''
highlight the maximum in a Series based on outcome.
'''
ed = s.xs('ED Discharge',level='OUTCOME')
d = s.xs('Death',level='OUTCOME')
h = s.xs('Hospitalization',level='OUTCOME')
#is_max = ((s == ed.max()) | (s == d.max()) | (s == h.max())) & (s != 0)
#return ['background-color: yellow' if v else '' for v in is_max]
color = []
for v in s:
if v == 0:
color.append('')
continue
if v == d.max():
color.append('background-color: lightskyblue')
elif v == h.max():
color.append('background-color: lightgreen')
elif v == ed.max():
color.append('background-color: sandybrown')
else:
color.append('')
return color
cols = ['TotalRate','Age0_14Rate','Age15_24Rate','Age25_44Rate','Age45_64Rate','Age65PlusRate']
df_yr_rates = df_yr[cols].agg([np.mean,np.max,np.min]).fillna(0)
df_yr_rates_hl = df_yr_rates.style.apply(hl_max_outcome)
df_yr_rates_hl
The table above shows the mean, max and min rates (per 100,000 population, among all cities that reported data) for each outcome in each of the five years in the dataset. Years 2013 and 2015 stand out for their high rates of death, hospitalization and ED discharges.
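To see which city is behind the maximum TotalRate in each year and outcome, one can index back into the municipal records (a sketch; it assumes every Year/OUTCOME group has at least one non-null TotalRate):
# locate the city reporting the highest TotalRate within each Year/OUTCOME group
# (assumes at least one non-null TotalRate per group)
idx_max = df.groupby(['Year','OUTCOME'])['TotalRate'].idxmax()
df.loc[idx_max, ['Year','OUTCOME','GeoName','TotalRate']]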
Turning to city-specific insights, we first list the cities for which records exist. The unincorporated areas of San Diego County are listed as a single municipal area. Note also that cities within the county vary widely in both population and geographical area.
display(HTML("<b>Geo Name</b>: {}".format(list(df.GeoName.unique()))))
pd.options.display.float_format = '{:,.0f}'.format
cols = ['Total','Age0_14','Age15_24','Age25_44','Age45_64','Age65Plus']
prop = ['GeoName','Year','OUTCOME']
df_city = df.groupby(prop, as_index=True, group_keys=False)
df_city = df_city.apply(lambda x: x.set_index(prop))
df_city = df_city[cols].fillna(0)
display(df_city.head())
display(df_city.tail())
df_city_totals = df_city['Total'].xs(2015,level='Year').unstack()
df_city_totals.plot(kind='bar', stacked=True, colormap='Set1')
fig = plt.gcf()
fig.set_size_inches(18, 6)
plt.legend(df_city_totals.columns, prop={'size': 12, 'weight': 'regular'})
ax = plt.gca()
ax.grid(False)
ax.set_facecolor('white')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.xaxis.label.set_visible(False)
title = "San Diego Cities: Flu Trends Yr 2015"
ax.set_title(title,fontsize=14, fontweight='bold',color='black')
ax.title.set_position([.5,1.05])
Once we split the data by municipality and plot it, as above, we see how these differences in population and area play out. The Unincorporated Areas and the City of San Diego together account for a large share of those afflicted by the flu. Given the differences in population between cities, however, it is better to plot rates of flu affliction to determine whether these cities are truly more prone to large outbreaks than the others.
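That dominance can be quantified directly from the 2015 totals plotted above (a quick check on df_city_totals; shares are of reported counts only):
# each municipality's percentage share of all reported 2015 outcomes, largest first
city_share = 100 * df_city_totals.sum(axis=1) / df_city_totals.sum(axis=1).sum()
city_share.sort_values(ascending=False).head()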
df_city_sd = df_city.xs('SAN DIEGO',level='GeoName',drop_level=True)
df_unincorporated = df_city.xs('UNINCORPORATED',level='GeoName',drop_level=True)
#df_city = df_city.drop(['SAN DIEGO','UNINCORPORATED'], level='GeoName')
fig, (ax1,ax2) = plt.subplots(1,2)
fig = plt.gcf()
fig.set_size_inches(18, 6)
df_city_sd['Total'].unstack().plot(kind='bar',stacked=True,ax=ax1,colormap='Set1')
title = "City of San Diego: Flu Trends (2011 - 2015)"
ax1.set_title(title,fontsize=12, fontweight='bold',color='black')
ax1.title.set_position([.5,1.05])
df_unincorporated['Total'].unstack().plot(kind='bar',stacked=True,ax=ax2,colormap='Set1')
title = "Unincorporated Areas: Flu Trends (2011 - 2015)"
ax2.set_title(title,fontsize=12, fontweight='bold',color='black')
ax2.title.set_position([.5,1.05])
for ax in [ax1,ax2]:
ax.legend(df_plot.columns, prop={'size': 12, 'weight': 'regular'})
ax.grid(False)
ax.set_facecolor('white')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.xaxis.label.set_visible(False)
    # rotate the year tick labels on each axes directly (avoids relying on pylab's current-axes xticks())
    plt.setp(ax.get_xticklabels(), rotation=30)
pd.options.display.float_format = '{:,.2f}'.format
cols = ['TotalRate','Age0_14Rate','Age15_24Rate','Age25_44Rate','Age45_64Rate','Age65PlusRate']
prop = ['GeoName','Year','OUTCOME']
df_city = df.groupby(prop, as_index=True, group_keys=False)
df_city = df_city.apply(lambda x: x.set_index(prop))
df_city = df_city[cols].fillna(0)
df_city.xs('SAN DIEGO',level='GeoName')
df_city_rates = df_city['TotalRate'].xs(2015,level='Year').unstack()
df_city_rates = df_city_rates.sort_values(['ED Discharge','Hospitalization','Death'],ascending=False)
fig, (ax1,ax2) = plt.subplots(1,2)
fig.set_size_inches(18, 12)
df_city_rates.plot(kind='barh', stacked=False, ax=ax1, colormap='Set1')
ax1.legend(df_city_rates.columns, prop={'size': 12, 'weight': 'regular'})
ax1.grid(False)
ax1.set_facecolor('white')
ax1.spines['top'].set_visible(False)
ax1.spines['right'].set_visible(False)
ax1.yaxis.label.set_visible(False)
title = "San Diego Cities: Flu Rates Yr 2015"
ax1.set_title(title,fontsize=14, fontweight='bold',color='black')
ax1.title.set_position([.5,1.05])
df_city_totals['NonFatalities'] = df_city_totals['ED Discharge'] + df_city_totals['Hospitalization']
labels = df_city_totals.index
sizes = df_city_totals.NonFatalities.fillna(0).values
#cs=cm.Set1(sizes/len(sizes))
cmap = cm.OrRd
cs = cmap(np.linspace(0., 1., len(labels)))
pie_whole = ax2.pie(sizes, labels=labels, autopct='%1.1f%%', shadow=False, startangle=90, colors=cs, labeldistance=1.05)
ax2.axis('equal')
for wedge in pie_whole[0]:
    wedge.set_edgecolor('white')
title = "San Diego Cities: Flu (Non-Fatalities) Yr 2015"
ax2.set_title(title,fontsize=14, fontweight='bold',color='black')
ax2.title.set_position([.5,1.05])
We have seen both flu incidence counts and flu affliction rates for the various age groups, across time windows and geographies, as part of the analysis for Q1 and Q2. Here we split the data by age and look at how the outcomes are distributed across the age groups.
pd.options.display.float_format = '{:,.0f}'.format
total = ['Total']
seniors = ['Age65Plus']
nonseniors = ['Age0_14','Age15_24','Age25_44','Age45_64']
nonfatalities = ['ED Discharge','Hospitalization']
# add a new column with aggregated nos across non-senior age groups
df_yr_totals['Age0_64'] = df_yr_totals[nonseniors].sum(axis=1)
# sum up non-fatal outcomes for each year and age-group
df_nf = df_yr_totals.loc(axis=0)[:,nonfatalities].groupby(level='Year').sum()
# get this data-frame in the right format so we can concatenate it to df_yr_totals
df_nf['OUTCOME'] = 'NonFatalities'
df_nf.set_index('OUTCOME', append=True, inplace=True)
df_nf = df_nf.reorder_levels(['Year','OUTCOME'])
#df_nf
df_age_totals = pd.concat([df_yr_totals, df_nf]).sort_index()
df_age_totals.tail(n=8)
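As a complementary view on the same table (a quick sketch; the shares will not sum to exactly 100% because of the small counts suppressed at the age-group level), each age group's share of the yearly total can be computed per outcome:
# percentage share of each age group in the reported yearly total for every outcome
# (suppressed small counts mean the shares need not sum to exactly 100)
age_cols = ['Age0_14','Age15_24','Age25_44','Age45_64','Age65Plus']
(df_age_totals[age_cols].div(df_age_totals['Total'], axis=0) * 100)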
# sort columns here so we can mitigate errors resulting from
# re-ordering caused by stack()
df_age_totals = df_age_totals.reindex(columns=sorted(df_age_totals.columns))
df_plot = df_age_totals.unstack(level=-1)
# stack causes re-ordering of index values
# CR: https://github.com/pandas-dev/pandas/issues/15105
df_plot = df_plot.stack(level=0)
years = [2015,2014,2013,2012,2011]
groups = ['Age0_14','Age0_64','Age65Plus']
outcomes = ['Death','Hospitalization','ED Discharge']
df_plot = df_plot.loc(axis=0)[years,groups]
df_plot = df_plot[outcomes]
# convert to multi-index columns
#df_plot.columns.rename('GROUPS',level=0,inplace=True)
#levels = [list(i.values) for i in df_plot.columns.levels]
#levels = [['Stats']] + levels
#df_plot.columns = pd.MultiIndex.from_product(levels, names=(None, 'GROUPS','OUTCOME'))
#df_plot.head(n=3)
def add_line(ax, xpos, ypos):
line = plt.Line2D([xpos, xpos], [ypos + .1, ypos],
transform=ax.transAxes, color='black')
line.set_clip_on(False)
ax.add_line(line)
def label_len(my_index,level):
labels = my_index.get_level_values(level)
return [(k, sum(1 for i in g)) for k,g in groupby(labels)]
def label_group_bar_table(ax, df):
ypos = -.1
scale = 1./df.index.size
for level in range(df.index.nlevels)[::-1]:
pos = 0
for label, rpos in label_len(df.index,level):
lxpos = (pos + .5 * rpos)*scale
ax.text(lxpos, ypos, label, ha='center', transform=ax.transAxes)
add_line(ax, pos*scale, ypos)
pos += rpos
add_line(ax, pos*scale , ypos)
ypos -= .1
fig = plt.gcf()
fig.set_size_inches(18, 8)
ax = fig.add_subplot(111)
df_plot.plot(kind='bar',stacked=False,ax=fig.gca(),colormap='Set1')
#Below 3 lines remove default labels
labels = ['' for item in ax.get_xticklabels()]
ax.set_xticklabels(labels)
ax.set_xlabel('')
label_group_bar_table(ax, df_plot)
fig.subplots_adjust(bottom=.1*df_plot.index.nlevels)
#ax.legend(df_plot.columns, prop={'size': 12, 'weight': 'regular'})
ax.grid(False)
ax.set_facecolor('white')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
title = "San Diego Cities: Flu Trends (2011-2015), Aggregates"
ax.set_title(title,fontsize=14, fontweight='bold',color='black')
ax.title.set_position([.5,1.05])
plt.show()