Health Benefits and Cost-Effectiveness From Promoting Smartphone Apps for Weight Loss: Multistate Life Table Modeling

Background Obesity is an important risk factor for many chronic diseases. Mobile health interventions such as smartphone apps can potentially provide a convenient low-cost addition to other obesity reduction strategies. Objective This study aimed to estimate the impacts on quality-adjusted life-years (QALYs) gained and health system costs over the remainder of the life span of the New Zealand population (N=4.4 million) for a smartphone app promotion intervention in 1 calendar year (2011) using currently available apps for weight loss. Methods The intervention was a national mass media promotion of selected smartphone apps for weight loss compared with no dedicated promotion. A multistate life table model including 14 body mass index–related diseases was used to estimate QALYs gained and health systems costs. A lifetime horizon, 3% discount rate, and health system perspective were used. The proportion of the target population receiving the intervention (1.36%) was calculated using the best evidence for the proportion who have access to smartphones, are likely to see the mass media campaign promoting the app, are likely to download a weight loss app, and are likely to continue using this app. Results In the base-case model, the smartphone app promotion intervention generated 29 QALYs (95% uncertainty interval, UI: 14-52) and cost the health system US $1.6 million (95% UI: 1.1-2.0 million) with the standard download rate. Under plausible assumptions, QALYs increased to 59 (95% UI: 27-107) and costs decreased to US $1.2 million (95% UI: 0.5-1.8) when standard download rates were doubled. Costs per QALY gained were US $53,600 for the standard download rate and US $20,100 when download rates were doubled. On the basis of a threshold of US $30,000 per QALY, this intervention was cost-effective for Māori when the standard download rates were increased by 50% and also for the total population when download rates were doubled. 
Conclusions In this modeling study, the mass media promotion of a smartphone app for weight loss produced relatively small health gains on a population level and was of borderline cost-effectiveness for the total population. Nevertheless, the scope for this type of intervention may expand with increasing smartphone use, more easy-to-use and effective apps becoming available, and with recommendations to use such apps being integrated into dietary counseling by health workers.


Overview and purpose
This Technical Report provides the documentation on the Burden of Disease Epidemiology Equity and Cost Effectiveness (BODE 3 ) DIET models. i The first intervention model (IM) estimates the effect of a range of preventive dietary interventions on risk factors. The second, the BODE 3 DIET multistate lifetable (MSLT) model, estimates the effect the change in risk factor has on health impacts and cost impacts of a range of interventions in the New Zealand population, with the ability to examine heterogeneity by sex, age and ethnicity.
 By health impacts, we mean a range of metrics. The primary metric is quality adjusted life years (QALYs) gained (or perhaps lost) by the intervention compared to modelled business as usual (BAU), but the following can also be outputted: mortality rates, morbidity rates, life years gained, and disease incidence.
 By cost impacts, we mean two levels. First, and as the main or default option, health system perspective costs. This is the net of both the intervention cost (e.g. the cost of a new law for new taxes or the cost of dietary counselling by practice nurses) and the downstream costs averted (or incurred) in the health system due to changing disease incidence and prevalence. Second, societal cost impacts, most notably productivity costs. ii This adds to the health system costs those costs due to gains (or losses) in productivity in the labour force through keeping people healthy to work. We plan to extend this to welfare benefit costs. Greenhouse gas emissions and other 'costs' will also be considered. We do not, however, extend out to monetary value of life as this is partially captured in the QALY metric.
 By preventive dietary interventions we mean public health or similar interventions that have the potential to change future dietary-related disease incidence. We consider these as two types of intervention: dietary interventions directly changing a 'risk factor', such as dietary counselling parametrized as directly changing body mass index (BMI; Section 1.02); dietary interventions that change food consumption or composition (and then change risk factors; Section 1.03).
 By heterogeneity by sex, age and ethnicity we mean that model outputs will be examined and contrasted by these demographic groups. Why? Several reasons: we are interested in the ability of population-wide interventions to reduce ethnic inequalities in health (and socioeconomic inequalities in the future); intervention effectiveness varies by background epidemiological parameters (e.g. if the cardiovascular disease [CVD] rate for a group is high, they stand to gain more); gains in QALYs (intervention effect held constant) will differ by background mortality and morbidity rates.
The conceptual structure of the combination of both of the BODE 3 models is shown below in Figure 1. Specific dietary interventions lead to change in foods consumed, and then to change in nutrients and physiological markers that in turn lead to changes in disease incidence. The dietary interventions are 'channelled through' selected foods (fruit and vegetables, sugar-sweetened beverages (SSBs)), 9 group codes were matched to an assortment of Nutritrack food products to provide price information for each food group. Product price was considered when matching food groups to the corresponding food products to ensure the range of products was most appropriate in terms of cost. Each food group required matching to at least one food product, and where possible food groups were matched to at least 10 different food products. v Where there were a limited number of appropriate Nutritrack food products available to match to NZANS food groups, food products were duplicated to allow more than one NZANS food group to be matched to one Nutritrack product. Food groups that reflected recipes (for example, casseroles/stews with sauce only) were matched to the most appropriate food products resembling the same or similar food components, and with probable similarities in terms of cost.
Prices for food groups that could not be matched to Nutritrack data (collected between December 2010 and April 2011) were obtained from online supermarket data. This included food products such as fresh fruit, vegetables and meat and poultry. The prices for these food products were obtained using the Countdown online supermarket (http://shop.countdown.co.nz). An unweighted average price was calculated across a range of food products considered to be most commonly consumed to obtain an average price for that food. Prices obtained from the online supermarket (year 2014) were scaled using the CPI to reflect 2011 prices.

Section 1.02. Dietary interventions parameterised as directly changing a risk factor
Here we focus on the 'general' modelling of dietary interventions onto 'risk factors' (the risk factors currently in the BODE 3 DIET MSLT model are fruit, vegetables, sugar-sweetened beverages, sodium and polyunsaturated fat intake and BMI). The parameterisation of these interventions is straightforward in principle. First, we need to determine, based on the current best evidence, the effect of the intervention on the particular risk factors. For example, how much does BMI decrease with an mHealth weight loss intervention? To determine the effect size we perform a literature search, and possibly expert knowledge elicitation (see BODE 3 Protocol for a general approach 4 , and specific publications for specific approach).
Then, there are a number of other factors to consider and parameterise:
- Who does the intervention affect? (e.g. just obese? Or everyone?)
- Is there any heterogeneity of effect size by population characteristics? (e.g. sex, age)
- What proportion of this population takes up and also completes the intervention?
- What attenuation of effect is there over time? (This is informed by a literature search, and probably at least some expert knowledge elicitation, as empirical estimates are often sparse and for short follow-up only.)
- Is there any heterogeneity of attenuation by population characteristics? (e.g. sex, age; however, it is most unlikely that enough information will be available to specify such heterogeneity of attenuation.)
This intervention effect size and attenuation, with attendant uncertainty, is then modelled as an absolute change in risk factor (e.g. a 0.2 absolute unit change in BMI), for the intervention population of interest (e.g. all people 65 years and older, all people with a BMI ≥ 30). This intervention effect size is then applied to each relevant category of risk factor (e.g. if the intervention was targeted at people with a BMI over 30, then only to these groups; see PART 2 later for more detail on how this 'feeds into' the population impact fraction (PIF) estimation using a 'relative risk shift method').

Section 1.03. Dietary interventions that change food consumption or composition
For the purposes of this Technical Report, we consider two types of intervention here: price changes from taxes and subsidies (Section 1.03.1); and food reformulation by the food industry (Section 1.03.2). Food taxes and subsidies are complex to model, and dominate this Section.

Food taxes and subsidies
The BODE 3 intervention model, which merges food price changes with price elasticities to generate changes in 346 foods consumed, is complex. The conceptual process is that a change in price of food(s) leads to change in purchasing (and in parallel consumption), modelled through price elasticities (PEs). This change in consumption then leads to percentage changes in food (vegetables, fruit, SSBs), nutrient (sodium, PUFA) and total energy intake, which in turn changes disease incidence. The most complicated component is the change in food price to change in consumption, through price elasticities, for reasons such as:
- There are many possible foods that can have a price change, yet price elasticities are (usually) only calculated for aggregate groupings of foods.
- For any single food with a price change, one has to model not only its own change in purchasing/consumption (through own-PEs), but also how the change in this food affects consumption of all (or some) other foods (through cross-PEs).
- Price elasticities are calculated as a system in a different context to that in which they are applied in modelling. For example, the starting consumption of foods may differ between the context in which the PEs were calculated and the population to be modelled. For a price set change (especially if large and/or affecting multiple foods; e.g. a saturated fat tax) the predicted purchasing/consumption of many foods changes, and it is possible to see 'implausible' changes in energy intake. Put another way, the PE modelling may 'correctly' see decreases and increases in consumption of foods relative to one another, but the net energy intake change may be implausibly large.
Yet food taxes and subsidies are a key public health research question, and using price elasticities is usually necessary. We address some of these issues in this Technical Report.
This Section is structured as follows:
- Price elasticities:
  - Disaggregating price elasticity matrices
  - Theoretically selected price elasticities
- Calculating the estimated change in consumption for a given price change
- Constraining total food expenditure change

(i) Price elasticities
In this section we:
1. Outline the method used to disaggregate a 24 by 24 price elasticity matrix (from the SPEND Study 5,6 ; Figure 2, page 13) into a 338 by 338 price elasticity matrix. The reason for this disaggregation is that our price interventions will differentially affect price within each of the 24 aggregated food categories (e.g. a saturated fat tax based on grams of saturated fat per 100g of product would not affect the price of each food sub-type (e.g. low and high fat cheese) by the same percentage within each aggregated food category (e.g. dairy products)). We mainly used theoretical means to do this, as empirical data are limited or non-existent. Price change, per gram of saturated fat, is based on the food composition data of the specific food groups.
2. Outline a basis upon which to theoretically 'set' many cross-PEs to zero for use as either the 'best' or scenario analyses (a choice that will be made in subsequent publications). The reason for doing this is that even a small (but erroneous) price elasticity from, say, dairy to fruit may just add more error to modelling, whereas theoretical setting of some cross-PEs to zero may improve subsequent modelling, or at least provide a useful scenario or sensitivity analysis. Many published modelling studies theoretically suppress cross-PEs (e.g. [6][7][8] ).
3. Outline a method to scale all purchases up or down by the same percentage, after modelling through disaggregated price elasticities. The reason for doing this is that even with our best efforts above to specify PE matrices, one may still end up with an implausible change in total food expenditure (and total energy intake). For example, a 10% increase in average food prices due to a saturated fat tax may result in no change in food expenditure (and a 10% reduction in energy intake) through the above PE matrix modelling. Yet there is an elasticity of total food expenditure given change in total food price: an envelope within which redistribution between foods must operate. There are also reasons from econometric theory why such an envelope is sensible to invoke, namely that if the prices of many foods change then expenditure on food (in total) now has to also consider the total household budget and income elasticities (e.g. for some budget-constrained families higher food prices may result in total reductions in the amount of food purchased).

(ii) Generating disaggregated PE matrices
Initial price elasticities were from the SPEND Study, conducted for New Zealand. 5,6 These are in a 24 by 24 matrix (see Figure 2, page 12) of own- and cross-PEs (with standard errors for default uncertainty). These 24 food groups have been matched to the 346 food groups used in the intervention model. This gives us 24 overall food groups and 338 food subgroups (ignoring 5 'alcoholic beverage' groups, 2 'dietary supplement' groups and 1 'not applicable' group). The 24 by 24 price elasticity matrix was then expanded to a 338 by 338 matrix as follows:
- Own-PEs: Econometric theory posits that as one keeps disaggregating foods into smaller and smaller subgroupings, the own-PE of each food is expected to increase (in absolute value terms). [9][10][11][12] For example, the own-PE of all bread might be -0.5, but that of wholegrain bread separated might be -0.55. Why? Because, assuming subgroups in each aggregated category are substitutes, changing the price of just white bread means consumers can swap to multigrain bread, so consumers can be more price sensitive (a larger, in the negative sense, own-PE). How much does the own-PE strengthen? Unfortunately, that is difficult to estimate. We have assumed that the own-PE increases by 2.5% (with wide uncertainty expressed as a standard deviation (SD) of 50% of 2.5%, i.e. 1.25 percentage points, on the normal scale) for each additional food sub-group. Of note, the own-PE increases by 5% if splitting one category in two (we deliberately allow a greater increase in own-PE for the first split), but then by 2.5% for each additional food category thereafter. (Whilst theoretical literature can be found to support the fact that own-PE increases with disaggregation, [9][10][11][12] we were unable to find empirical research on the same for food. We therefore plan to undertake such analyses ourselves in the future with data collected from a virtual supermarket experiment in the Price ExaM study (within the DIET Programme; https://diet.auckland.ac.nz/content/price-exam), for which we can change the level of food disaggregation in calculating own-PEs, and then amend our 2.5% estimate as appropriate.) The overall sensitivity of the modelling to this parameter will be investigated and reported with one-way uncertainty analyses and Tornado plots (e.g. of QALYs gained).
- Cross-PEs within the initial food group: We assume that each food subgroup (e.g. four bread subtypes of white bread, fibre-containing white bread, wholemeal bread and wholegrain bread) within each separate food (e.g. bread) is a substitute for each other, meaning they have small positive cross-PEs. We specify all these so that the sum (across rows of the PE matrix) of own- and cross-PEs gives the SPEND Study's own-PE, following econometric theory. 12,13 For example, if as above the own-PE of breads as one aggregated food category was -0.5, but when disaggregated the four subcategories of bread each had an own-PE of -0.55, then the sum of wholegrain bread's own-PE (-0.55) and each of the three cross-PEs of white, fibre white and wholemeal onto wholegrain must be -0.5, meaning the sum of the three cross-PEs must be +0.05. We disaggregated this quantum across the three non-wholegrain breads in proportion to their consumption (i.e. the cross-PE of a commonly purchased item on x is greater than the cross-PE of a rarely purchased item on x). For example, assume that the percentage consumption of the three non-wholegrain breads was white=50%, fibre white=20% and wholemeal=30%; then the cross-PE for white onto wholegrain would be 50%×0.05 = +0.025, for fibre white +0.01, and for wholemeal +0.015.
Note that, thus far, we have two main assumptions: first, that own-PEs increase by 2.5% (with wide uncertainty) for each additional sub-category of food; second, that the disaggregation of cross-PEs is proportionate to that food's relative consumption. These assumptions are qualitatively justified based on econometric theory, but the exact quantification (or weighting) is unknown, and needs empirical testing (and in the meantime uncertainty or scenario analyses). vi
- Cross-PEs for sub-categories of food in different aggregate categories (for example, for each of four breads (white, fibre white, wholemeal and wholegrain) onto any fruit): Again, the cross-PE from the aggregated categories needs to be disaggregated by food sub-category, and we assume weighted by consumption. So, extending the above example, the cross-PE of aggregated bread onto fruit is 0.16 from the SPEND PE matrix (Figure 2, page 13). Assume wholegrain was 20% of all bread expenditure (the percentages above excluded wholegrain), meaning percentage expenditure on the three other breads within all breads is white=80%×50%=40%, fibre white=80%×20%=16% and wholemeal=80%×30%=24%. Therefore, the cross-PEs for each of these four breads onto (any) fruit are estimated to be white=40%×0.16=0.064, fibre white=16%×0.16=0.026, wholemeal=24%×0.16=0.038 and wholegrain=20%×0.16=0.032.
Logic checking of the above was undertaken by determining changes in purchasing for various policies that had the same percentage price change on all food subtypes (e.g. all sub-types of bread and cereals) within each aggregate food category (e.g. bread and cereals combined) through the completely disaggregated and the 'simple' aggregated price elasticity matrix -identical results were obtained, as should be the case.
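The three disaggregation steps and the logic check can be sketched in a few lines of Python, using the bread example from the text. This is a minimal illustration only: the function names are hypothetical, the 5% and 2.5% increments are assumed to add up rather than compound (the text does not say which), and the aggregate bread-onto-fruit cross-PE of 0.16 is the value used in the worked arithmetic.

```python
def strengthened_own_pe(aggregate_own_pe, n_subgroups, step_pct=2.5):
    """Own-PE after splitting one aggregate category into n_subgroups.

    Assumption: increments are additive. The first split adds 2 x step_pct
    (5%), each further subgroup adds step_pct (2.5%).
    """
    if n_subgroups < 2:
        return aggregate_own_pe  # nothing was disaggregated
    total_pct = 2 * step_pct + (n_subgroups - 2) * step_pct
    return aggregate_own_pe * (1 + total_pct / 100)


def split_by_share(total, shares):
    """Split `total` across items in proportion to their shares."""
    denom = sum(shares.values())
    return {name: total * share / denom for name, share in shares.items()}


# Step 1: own-PE of each of four bread subtypes (aggregate bread own-PE = -0.5)
own = strengthened_own_pe(-0.5, 4)  # about -0.55

# Step 2: cross-PEs of the other three breads onto wholegrain.
# Their sum must bring the row back to the aggregate own-PE of -0.5.
quantum = -0.5 - own  # about +0.05
within = split_by_share(quantum, {"white": 50, "fibre white": 20, "wholemeal": 30})
wholegrain_row_sum = own + sum(within.values())  # about -0.5

# Step 3: cross-PEs of each bread onto fruit, weighted by bread expenditure share
bread_shares = {"white": 40, "fibre white": 16, "wholemeal": 24, "wholegrain": 20}
onto_fruit = split_by_share(0.16, bread_shares)

# Logic check: the same 10% price rise on every bread subtype should move fruit
# purchasing by the same amount as the aggregated matrix would.
uniform_rise = 10
fruit_change_disaggregated = sum(pe * uniform_rise for pe in onto_fruit.values())
fruit_change_aggregated = 0.16 * uniform_rise

print({k: round(v, 3) for k, v in within.items()})
print({k: round(v, 3) for k, v in onto_fruit.items()})
```

In a Monte Carlo run the 2.5% step would itself be drawn from a normal distribution with SD 1.25, for example with `random.gauss(2.5, 1.25)`, before being passed in as `step_pct`.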
vi Assumptions implicit to price elasticity matrices include: -The homogeneity assumption: the sum of the cross-PEs for a product and the income elasticity for that product is zero. -The budget constraints assumption: the sum of the income elasticities weighted by the share of income spent on the goods is equal to 1. Further mathematical work by Scarborough and Blakely managed to meet this 'stricter' homogeneity assumption using an 'odds' method to calculate the cross-PE in this system (further information from authors; emails and workings August 2016). However: 1) whilst 'mathematically correct' for one system of disaggregated foods, implausibly high cross-PEs can result; 2) it was mathematically intractable to find a solution of linear equations to apply to a larger food system (as we need to in the BODE 3 intervention model). We also note that the application of PE matrices calculated in one setting (with a set of assumptions (e.g. conditionality, meaning no change in budget share for food)) to another setting (e.g. New Zealand in the future with different starting distributions of food consumption, tastes and preferences) is structurally uncertain -albeit unavoidable. Therefore, in the interests of model parsimony, we settled on the approach here detailed in the main text of this Appendix.

(iii) When empirical data on disaggregated PEs exists from other research
Finally, for soft drinks there were actual estimates of cross-PEs for regular and diet soft drinks available through a paper published by Sharma et al (2014) 14 (Australian study). These were rescaled to the SPEND carbonated beverages own-PE as follows: a) Assume that the relative distributions of own/cross-PEs in Sharma et al apply to New Zealand. b) Then imagine that diet and regular soft drinks have the same price increase/decrease, meaning that this 2 by 2 matrix should return what the single own-PE in SPEND returns. c) The SPEND own-PE is -1.23. Thus, we need to make the Sharma 2 by 2 matrix behave as if it were -1.23 in aggregate. We achieved this by a scalar based on budget share (using food consumption from the NZANS as a proxy).
The scalar is calculated as follows, using the own-PEs and cross-PEs from Sharma et al after rescaling:
- A 1% price increase in regular soft drinks will give a 1.509% reduction in regular and a 0.670% increase in diet soft drinks. Given that regular soft drinks make up 83.2% of consumption, and diet ones 16.8%, the net change in soft drink consumption (due to change in regular prices only) will be (83.2% × -1.509%) + (16.8% × 0.670%) = -1.143%.
- A 1% price increase in diet soft drinks will give a 0.383% increase in regular and a 2.418% decrease in diet, and therefore a net change in soft drink consumption (due to change in diet prices only) of (83.2% × 0.383%) + (16.8% × -2.418%) = -0.088%.
- Therefore, a 1% price increase in both regular and diet soft drinks will give a net change of -1.143% + -0.088% = -1.23%, consistent with the 'starting' SPEND own-PE of -1.23.
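The consistency check for the soft drink block can be reproduced as below. This is a sketch under stated assumptions: the function names are hypothetical, the matrix holds the values quoted in the text (rows are the drink consumed, columns the drink whose price changes), and a single multiplicative scalar is assumed for the rescaling because the exact formula is not reproduced here.

```python
def aggregate_own_pe(pe, shares):
    """Net % change in total consumption when every price rises by 1%.

    pe[i][j] is the % change in consumption of item i for a 1% price
    rise in item j; shares[i] is item i's share of consumption.
    """
    n = len(shares)
    return sum(shares[i] * pe[i][j] for i in range(n) for j in range(n))


def rescale_to_target(pe, shares, target_own_pe):
    """Multiply every entry by one scalar so the aggregate matches the target."""
    scalar = target_own_pe / aggregate_own_pe(pe, shares)
    return [[scalar * value for value in row] for row in pe], scalar


shares = [0.832, 0.168]        # regular, diet (consumption shares from the text)
soft_drink_pe = [
    [-1.509, 0.383],           # regular consumption: regular price, diet price
    [0.670, -2.418],           # diet consumption: regular price, diet price
]

net = aggregate_own_pe(soft_drink_pe, shares)
print(round(net, 2))           # -1.23, the SPEND carbonated beverages own-PE

# Rescaling an already consistent matrix should need a scalar of about 1.
_, scalar = rescale_to_target(soft_drink_pe, shares, -1.23)
```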
This disaggregation was repeated in a sensitivity analysis using own- and cross-PEs from a report published by Tiffin et al in 2011 15 .
Selected examples of expected (i.e. no uncertainty propagated through calculations) cross- and own-PEs for some of the food sub-types from the fully disaggregated PE matrix are shown in Table 1 below (using methods 1, 2 and 3 above for everything except the underlined block of disaggregated soft drink PEs, which uses method 4 above, for Sharma et al (2014) external data), and can be contrasted with the more aggregated SPEND PEs shown in Figure 2, page 12.

(iv) Theoretically selected cross-PEs
We updated the literature review from our previous work 16 to include PE studies for high-income countries (mainly UK, US and Australia). We searched the Ovid database with the keywords: "Price elasticit$" AND "Food$" OR "Drink$" OR "Beverage$", NOT tobacco, NOT alcohol, from 2000 onwards (English language, human, full text). Studies that just estimated price elasticities in low or middle income countries were ignored. Studies had to report cross-PEs between at least two food groups (given we are interested in cross-PEs). There were 11 studies that met our search criteria. 9,[17][18][19][20][21][22][23][24][25][26] We matched food groups from the selected studies with the BODE 3 intervention model's food groups. Then all PEs from these studies were extracted to a database. Median cross-PEs from this database were selected as the best cross-PE for each food group pairing in the PE matrix. (There were some outliers in the data, so we decided not to use average cross-PEs; the majority of the cross-PEs had three or more estimates.) We refer to these selected cross-PEs as the BODE 3 cross-PEs (as opposed to the SPEND cross-PEs). We also classified cross-PEs as a weak, medium or strong association. That is: weak, |cross-PE| ≤ 0.04; medium, 0.04 < |cross-PE| ≤ 0.09; strong, |cross-PE| > 0.09. These values were estimated from our PE database above, with weak association accounting for the lower 25th percentile, strong association for the upper 25th percentile, and medium association being the rest.
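The selection and classification rule can be illustrated with a short sketch. The cut-offs of 0.04 and 0.09 are those used for the suppression scenarios in this report; the function names and the four example estimates are hypothetical and are not taken from the PE database.

```python
from statistics import mean, median

WEAK_MAX = 0.04    # |cross-PE| <= 0.04 is classified as weak
MEDIUM_MAX = 0.09  # 0.04 < |cross-PE| <= 0.09 is classified as medium


def select_cross_pe(estimates):
    """Pick the median of the published estimates for one food group pairing."""
    return median(estimates)


def classify(cross_pe):
    """Classify a cross-PE by its absolute size."""
    size = abs(cross_pe)
    if size <= WEAK_MAX:
        return "weak"
    if size <= MEDIUM_MAX:
        return "medium"
    return "strong"


# Hypothetical estimates for one pairing, including one outlier
estimates = [0.02, 0.05, 0.06, 0.45]

chosen = select_cross_pe(estimates)       # 0.055, barely moved by the outlier
print(classify(chosen))                   # medium
print(classify(mean(estimates)))          # strong, driven by the outlier
```

The contrast between the last two lines shows why a median was preferred to an average when a pairing has only a handful of estimates.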
For modelling of the impact of price changes on food purchasing/consumption, we will use three general approaches, each with alternative options or scenarios within it:

Approach A: use SPEND PEs
In this approach we will simply use all SPEND own-PEs and cross-PEs (i.e. no suppression of any cross-PEs, use standard errors about each own-PE and cross-PE as initial uncertainty intervals to draw from in Monte Carlo simulation).
Suppress selected cross-PEs as sensitivity analyses:
- suppress (i.e. set to 0) those SPEND cross-PEs that in the above-mentioned literature review we classified as 'weak', i.e. where the BODE 3 |cross-PE| ≤ 0.04 (AS1, see Appendix B: SPEND Study price elasticity tables, page 72);
- suppress those SPEND cross-PEs that in the above literature review we classified as 'weak' or 'moderate', i.e. where the BODE 3 |cross-PE| ≤ 0.09 (AS2, see Appendix B: SPEND Study price elasticity tables, page 72);
- suppress those SPEND cross-PEs as 'theoretically' determined by previous users 6,8 of SPEND price elasticities (AS3, varied by policy and will be described in detail if used).

Approach B: use BODE 3 (cross) PEs
In this approach we will retain SPEND own-PEs, but use the median BODE 3 cross-PEs from the literature (BS1).
Suppress selected cross-PEs as sensitivity analyses:
- suppress (i.e. set to 0) those BODE 3 cross-PEs that in the above literature review we classified as 'weak', i.e. where the BODE 3 |cross-PE| ≤ 0.04 (BS2, see Appendix B: SPEND Study price elasticity tables, page 72);
- suppress those BODE 3 cross-PEs that in the above literature review we classified as 'weak' or 'moderate', i.e. where the BODE 3 |cross-PE| ≤ 0.09 (BS3, see Appendix B: SPEND Study price elasticity tables, page 72).
- Additional sensitivity analysis: use the median BODE 3 own- and cross-PEs from the literature (BS4, see Appendix B: SPEND Study price elasticity tables, page 72).
All the above Approaches used the above described disaggregation method (page 13) to move from the SPEND 24 by 24 matrix to the fully disaggregated 338 by 338 matrix.
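The suppression scenarios (AS1, AS2, BS2, BS3) all follow the same pattern: zero any cross-PE whose BODE 3 counterpart falls at or below a threshold, and leave own-PEs untouched. A minimal sketch follows; the function name and both 3 by 3 matrices are hypothetical and are used only to show the mechanics.

```python
def suppress_cross_pes(pe, reference, threshold):
    """Set cross-PEs to 0 where |reference cross-PE| <= threshold.

    pe        : matrix to be used in modelling (e.g. SPEND PEs)
    reference : matrix used for the weak/moderate classification (BODE 3 cross-PEs)
    Own-PEs (the diagonal) are never suppressed.
    """
    n = len(pe)
    return [
        [pe[i][j] if i == j or abs(reference[i][j]) > threshold else 0.0
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 3 by 3 example (not values from the SPEND or BODE 3 matrices)
spend_pe = [
    [-0.70, 0.05, 0.12],
    [0.02, -0.50, 0.03],
    [0.15, 0.01, -0.60],
]
bode3_cross = [
    [0.00, 0.03, 0.10],
    [0.06, 0.00, -0.02],
    [0.20, 0.08, 0.00],
]

as1 = suppress_cross_pes(spend_pe, bode3_cross, threshold=0.04)  # drop 'weak'
as2 = suppress_cross_pes(spend_pe, bode3_cross, threshold=0.09)  # drop 'weak' and 'moderate'
```

Passing the BODE 3 matrix as both `pe` and `reference` gives the equivalent of scenarios BS2 and BS3.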

(v) Calculating change in consumption for a given price change
Whilst the matrices are large, and there is uncertainty in the own- and cross-PEs (that is, uncertainty intervals about each own- and cross-PE that are sampled from during Monte Carlo simulation), the actual mechanics of calculating the change in consumption are fairly straightforward. Imagine that there are only three food groups, A, B and C. Next, assume a PE matrix in which, for each 1% increase in the price of A, consumption of A will reduce by 0.7% (own-PE), but consumption of B will increase by 0.02% and consumption of C will increase by 0.15% (cross-PEs). And so on.

Assume that A has a 20% increase in price, B a 10% increase in price, and C no change in price. Next, assume that initial consumption of A was 500g, B 200g and C 100g. Then the post price change consumption of each food is obtained by applying the own- and cross-PEs to these price changes. This gives change in grams. Whilst we are using consumption data in grams, not purchasing data in grams, as long as one assumes that wastage (i.e. the percent of food purchased that is not consumed) is similar between baseline and intervention, one can safely convert to percentage change after working with grams in the actual calculations. (We acknowledge that this is a simplifying assumption about wastage.)
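The three-food example can be worked through as follows. This is an illustrative sketch, not the model's code: only the effects of a price change in A (-0.7, +0.02, +0.15) come from the text; the entries for price changes in B and C are hypothetical placeholders, and the PE effects are assumed to combine linearly.

```python
def post_price_consumption(baseline_g, price_change_pct, pe):
    """Consumption (grams) after price changes.

    pe[i][j] is the % change in consumption of food i for a 1% price rise
    in food j. Effects of the separate price changes are summed (assumption).
    """
    result = {}
    for food, grams in baseline_g.items():
        pct_change = sum(pe[food][other] * price_change_pct[other]
                         for other in baseline_g)
        result[food] = grams * (1 + pct_change / 100)
    return result


pe = {
    # price of:   A      B (hypothetical)  C (hypothetical)
    "A": {"A": -0.70, "B": 0.05, "C": 0.03},
    "B": {"A": 0.02, "B": -0.50, "C": 0.04},
    "C": {"A": 0.15, "B": 0.01, "C": -0.60},
}
baseline_g = {"A": 500, "B": 200, "C": 100}
price_change_pct = {"A": 20, "B": 10, "C": 0}

new_g = post_price_consumption(baseline_g, price_change_pct, pe)
print({food: round(grams, 1) for food, grams in new_g.items()})
# {'A': 432.5, 'B': 190.8, 'C': 103.1}
```

For food A, for instance, the change is (-0.7 × 20) + (0.05 × 10) = -13.5%, so 500g becomes 432.5g.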

(vi) Constraining total food expenditure change
The price elasticities used in this model were calculated from a subset of the New Zealand population, with internationally sourced cross-PEs for scenarios BS1 to BS3 and internationally sourced own and cross-PEs for scenario BS4, and do not 'fit' perfectly to the consumption data from the NZANS used in this model. Moreover, the price elasticity values we use are from 'conditional' models, where the total expenditure on food is assumed fixed; if the interventions we model substantially change prices and therefore overall expenditure on food, we need to allow for how much total food expenditure changes as a result of price changes. These two problems can lead to implausible changes in food expenditure and energy intake if the price elasticities are naively used without constraints.
To address this issue, we need to consider how total food expenditure changes as a result of substantive changes in food prices, i.e. the total food expenditure elasticity (TFE e ). Theoretically, we would not expect the TFE e to exceed 1.0. If it did exceed 1.0, this would suggest that as food prices increased expenditure increased even faster, which is clearly implausible on a fixed household total budget. Conversely, it seems unlikely that the TFE e is less than 0, as food is essential to our existence. Accordingly, the naïve upper confidence limit of 1.21 from the Michelini (1999) derived TFE seems implausible: it should be less than 1.0. Table 2 (page 22) presents TFE e estimates for eight studies that used multi-stage budgeting models to estimate unconditional and uncompensated food own-PEs, for high-income countries up to June 2017 (keywords: "price elasticities" or "price elasticity" or "demand" and "food" and "multi-stage" or "multi stage", mainly Google Scholar). Consistent with theoretical expectation, all estimates were between zero and one, albeit spanning this entire range. The previous New Zealand study estimated a TFE e of 0.68, a bit less than 0.832. The average, median and standard deviation across these eight studies were 0.59, 0.66 and 0.29, respectively. In the absence of an ideal (let alone perfect) recent New Zealand study, we elected to specify a Beta distribution for the TFE e ; a Beta distribution was chosen as the value needs to be between 0 and 1. Values for alpha and beta were varied in order to return a mean close to the New Zealand literature and were set to 6 and 2. This returns a mean of 0.75.
There was one additional prior step required too. Changing total household expenditure on food is equivalent to an income change for food consumption. Therefore, income elasticities for each food category were also applied.
This step made little relative difference to food expenditure, and everything was still scaled to the 'set' new expenditure based on the TFE e and percentage change in food price index.
In summary, given our (necessary) reliance on: a) less than ideal price elasticity matrices; b) baseline food consumption distribution in our simulation studies that are not the same as that used in price elasticity estimation and; c) simulated food price interventions that will change the food price index by more than a trivial amount, it was necessary to 'set' the new total expenditure on food. To not do so would have risked implausible changes in total food expenditure and -importantly for final estimation of health gains -implausible changes in food energy intake. We specify generous uncertainty about the TFE e , as it is genuinely uncertain. Finally, the TFE e essentially just scales all food purchasing up or down by the same amount; the relative impact on food consumption from the PE matrix is preserved (e.g. the effect of a saturated fat tax decreasing fatty food purchasing but increasing non-fatty food purchasing, relative to each other, is preserved).
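A minimal sketch of the expenditure constraint is given below. Several pieces are assumptions made for illustration rather than details taken from this report: the food price index is computed by costing the baseline basket at new prices, the target expenditure is baseline expenditure × (1 + TFE e × proportional change in the index), prices are hypothetical unit prices per gram, and the income elasticity step is left out. Only the Beta(6, 2) parameters come from the text.

```python
import random

ALPHA, BETA = 6, 2  # Beta distribution parameters for the TFE e


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def constrain_expenditure(base_qty, new_qty, base_price, new_price, tfe_e):
    """Scale all post-PE quantities by one factor to hit the 'set' expenditure."""
    base_spend = sum(q * p for q, p in zip(base_qty, base_price))
    # Assumed price index: baseline basket costed at the new prices
    fpi_change = sum(q * p for q, p in zip(base_qty, new_price)) / base_spend - 1
    target_spend = base_spend * (1 + tfe_e * fpi_change)
    unconstrained_spend = sum(q * p for q, p in zip(new_qty, new_price))
    scale = target_spend / unconstrained_spend
    return [q * scale for q in new_qty], target_spend


# One Monte Carlo draw of the TFE e, always between 0 and 1
rng = random.Random(2011)
tfe_draw = rng.betavariate(ALPHA, BETA)

# Expected-value run with hypothetical quantities and unit prices
new_price = [1.2, 1.1, 1.0]
scaled_qty, target = constrain_expenditure(
    base_qty=[500, 200, 100],
    new_qty=[432.5, 190.8, 103.1],
    base_price=[1.0, 1.0, 1.0],
    new_price=new_price,
    tfe_e=beta_mean(ALPHA, BETA),  # 0.75
)
```

Because every quantity is multiplied by the same factor, the ratios between foods produced by the PE matrix are unchanged; only the total is moved.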

Food reformulation
The methods used for food reformulation will be expanded in future versions of this Technical Report. In principle, the approach will be: 1. Specification of the policy option, and what foods/nutrients it targets. 2. Estimation of how much individual food product, or nutrient amounts directly, change as a result of the policy. This will be fed into the foods, and resultant changes in risk factors, from baseline, will be estimated. These are likely to be for nutrient risk factors and BMI only (i.e. for sodium, PUFA and BMI).

Risk factor distributions
There are currently six risk factors generated in the BODE 3 intervention model that flow into the BODE 3 DIET MSLT model: change in BMI, and change in intake of fruit (grams/day), vegetables (grams/day), sugar-sweetened beverages (SSBs, mls/day), sodium (mgs/day) and polyunsaturated fat (as a percentage of total energy (%TE)), each between baseline intake and intervention intake.
Changes in consumption from baseline to intervention are calculated separately for Māori and non-Māori, males and females, but due to data limitations could not usually be further calculated by age-groups. We treat this (necessary) simplification as satisfactory for estimating average changes across ages, and from there the percentage change (of baseline intake). But given that there are some important age variations in risk factor distributions (e.g. SSBs more commonly consumed by young people), it was necessary to use the 'all ages percentage change' to in turn estimate grams or mls change by age.
This percentage difference is applied to the average consumption for the specific age-groups (15-25, 25-35, 35-45, 45-55, 55-65, 65-75, 75-85 and 85+), giving a change in intake in grams (for fruit and vegetables) or mls (for SSBs) specific to each sex, ethnic and age-group. Change in sodium uses the change in grams for all the different food groups and the sodium content of these foods (outlined in Section 1.01.1) to calculate a change in mg of sodium. This is also calculated by sex and ethnic groups and estimated as above for age groups. The percentage of total energy (%TE) from polyunsaturated fat is calculated for baseline and intervention. The change in %TE from polyunsaturated fat is the risk factor that flows through to the BODE 3 DIET MSLT model and is not differentiated by age-group.

(See Table 3, page 25.) We applied the estimated percentage change to the grams per day by age-group (within Māori males) given by the NZANS 2. Accordingly, absolute consumption of SSBs was estimated to decrease (under the 10% SSB tax intervention) by a minimum of 2.35mls per day for the elderly, and a maximum of 54.54mls per day for young Māori males.

Notes to Table 3: *As a result of the intervention (with TFE e switched on), average intake (for the four demographic groups as a whole) changed by this absolute amount. **The absolute change was converted to a percentage change that was then applied to the baseline intake of the specific age-groups to give an estimate of absolute change by age.

For all risk factors except BMI the change occurs in the first year; for BMI it takes 2 years for the full BMI change to occur (see section 2.01.1 for details). For taxes and subsidies the change in risk factor is then maintained for the length of the tax/subsidy. For one-off interventions the initial effect starts to decay after the first year (or 2 in the case of BMI, see section 1.01.08 for details).
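The 'all ages percentage change' step described above can be sketched as follows. This is an illustrative example only: the function names and all intake values are hypothetical and are not taken from the NZANS or Table 3.

```python
def pct_change(baseline_all_ages, intervention_all_ages):
    """Percentage change (as a fraction of baseline) for one sex by ethnic group."""
    return (intervention_all_ages - baseline_all_ages) / baseline_all_ages


def change_by_age(pct, baseline_by_age):
    """Apply the all-ages percentage change to each age-group's baseline intake."""
    return {age: pct * intake for age, intake in baseline_by_age.items()}


# Hypothetical SSB intakes (mls/day) for one sex by ethnic group.
pct = pct_change(baseline_all_ages=200.0, intervention_all_ages=190.0)
absolute_change = change_by_age(pct, {"15-25": 400.0, "75-85": 20.0})
```

In this hypothetical case the same 5% reduction yields a larger absolute change in mls for the age-group with the higher baseline intake.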

Change in BMI
Change in BMI is calculated through a change in energy intake from baseline to intervention. As outlined in the Nutrients section on page 7, baseline consumption is matched to the energy content of the foods consumed. As consumption increases or decreases so does the energy intake.
Change in energy intake is converted to change in kg and change in BMI using the formula presented in Hall et al (2011). 34 This paper critiques the commonly used 'static weight-loss rule': reduction of food intake of 2 MJ/day will lead to a steady rate of weight loss of 0.5kg/week. The Hall et al method takes into account the dynamic physiological adaptations that occur with decreased bodyweight, and quantifies the effect of energy imbalance on bodyweight using mathematical modelling: reduction of food intake of 100kJ/day will lead to a change of 1kg, with half of the weight change reached in 1 year and 95% by year 3. This is operationalised in the BODE 3 DIET MSLT model as 50% of the change in BMI in the first year, then 100% of the change by the second year, and then with subsequent weight change either held constant or decayed (due to decaying intervention effect) over time.
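A minimal sketch of this operationalisation follows, assuming the 100kJ/day per kg rule and the 50%/100% phase-in stated above. The conversion from kg to BMI uses the definition BMI = kg/m^2 with a single hypothetical height; how the model itself handles height is not described here, so that part is an assumption.

```python
KJ_PER_DAY_PER_KG = 100.0      # Hall et al rule quoted in the text
YEAR_1_SHARE = 0.5             # 50% of the BMI change in the first year
LATER_SHARE = 1.0              # 100% from the second year onwards


def bmi_change_by_year(delta_kj_per_day, height_m, years):
    """Phased-in BMI change for a sustained change in daily energy intake.

    height_m is a hypothetical average height used only for illustration.
    """
    full_kg = delta_kj_per_day / KJ_PER_DAY_PER_KG
    full_bmi = full_kg / height_m ** 2
    return [full_bmi * (YEAR_1_SHARE if year == 0 else LATER_SHARE)
            for year in range(years)]


# A sustained reduction of 100 kJ/day for a (hypothetical) height of 2.0 m.
trajectory = bmi_change_by_year(-100.0, 2.0, 3)
```

Any decay of the intervention effect would be applied on top of this trajectory.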

PART 3. Disease Modelling
Part 3, Disease Modelling in the BODE 3 DIET MSLT model, is presented as six sections:
1. Section 3.01 outlines the structure of the BODE 3 DIET MSLT model.
2. Section 3.02 outlines the baseline specification and parametrization of the model. In other words, how the mortality, morbidity and cost parameters are expected to behave under 'business as usual' (BAU).
3. Section 3.03 presents model calibration.
4. Section 3.04 presents model validation.
5. Section 3.05 briefly outlines analysis.
6. Section 3.06 provides an additional note on why we use disability-adjusted life-years (DALYs) and QALYs interchangeably in the context of simulation modelling.

- Everyone still alive in each cycle of the model (more specifically, the alive proportion for whichever five-year cohort is currently being modelled) is represented in the main life-table. In this main life-table, age-specific all-cause mortality and morbidity rates are applied in each cycle to the 'alive cohort', until the age of 110 years when all remaining alive people are assumed to die. As such, the sum of QALYs can be tallied.
- In parallel, proportions of the cohort can simultaneously reside in one or more parallel disease-specific life-tables or states. Or put more correctly, multiple disease states are modelled independently. vii Within these disease-specific life-tables, disease incidence rates, remission and case-fatality rates, and disease-specific morbidity (disability weights from the New Zealand Burden of Disease Study (BDS) 35 and GBD 36 ), and disease-specific costs, are modelled.
- The disease-specific life-tables have both a BAU and intervention model. The latter intervention model differs from the BAU model, in that incidence rates are changed (usually lowered) based on population impact fractions (PIFs; a 'merging' of changes in risk factor distributions and relative risks; see 3.01.4 later in this Technical Report). This allows a calculation of differences in disease-specific mortality and morbidity rates, and differences in disease-costs per capita.
- These differences are then summed across all parallel disease states, and added to or subtracted from the all-cause mortality and morbidity rates in the main life-table and captured as cost differences between BAU and intervention, allowing estimation of QALYs gained (or lost) and health system cost change between the BAU and intervention scenarios for the population overall, which is the main objective of the modelling.

vii With the exception of diabetes, which has been 'linked' to coronary heart disease and stroke states (See section 1.01.09 for details).

Figure 4: Schematic of a proportional multi-state life-table, showing the interaction between disease parameters and life-table parameters, where x is age, i is incidence, p is prevalence, m is mortality, w is disability-adjustment (or health status valuation), q is probability of dying, l is number of survivors, L is life years, Lw is health adjusted life expectancy (HALE), and where '-' denotes a parameter that specifically excludes modelled diseases, and '+' denotes a parameter for all diseases (i.e. including modelled diseases). 37
Figure 4 (page 30) is an alternative way of presenting a proportional multi-state life-table structure. There are numerous 'disease processes' that are modelled independently, and the total population 'experience' (in this case shown as health-adjusted life expectancy, or quality adjusted life expectancy) is a sum of these disease process contributions, and the mortality and morbidity experience due to all remaining diseases considered as one 'residual entity'. The way the intervention simulations work (not shown directly in the figure below) is to calculate changes between BAU and intervention scenarios in mortality, prevalence and disability rates for each disease process (due to changing disease incidence rates in each disease process), and then 'sum' these changes to calculate new total population (i.e. in the main life-table) mortality, prevalence and disability rates. And from here one derives a change in quality adjusted life years lived by the cohort.
Other outputs, such as change in total mortality rate, can also be produced. Finally, health system costs can be 'attached' to the model structure in a similar way to disability or morbidity weights, allowing an estimation of change in health system costs due to changing disease epidemiology (see Section 3.02.5).
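The way disease-specific differences feed into the main life-table can be sketched as below. This is a deliberately simplified illustration: it uses one-year cycles, converts rates to probabilities with 1 - exp(-rate), assumes deaths occur mid-year, and omits discounting, costs and the forced closure at age 110. All function names and rates are hypothetical.

```python
import math


def life_table_qalys(mort_rates, morbidity_rates, start_pop=1.0):
    """Tally QALYs for one cohort across annual cycles of a main life-table."""
    alive = start_pop
    qalys = 0.0
    for m, w in zip(mort_rates, morbidity_rates):
        q = 1 - math.exp(-m)                 # mortality rate -> annual probability
        deaths = alive * q
        person_years = alive - 0.5 * deaths  # assumption: deaths occur mid-year
        qalys += person_years * (1 - w)      # morbidity-adjusted life-years
        alive -= deaths
    return qalys


def apply_disease_deltas(rates, deltas_by_disease):
    """Sum per-disease rate differences (intervention minus BAU) and add them."""
    adjusted = []
    for i, rate in enumerate(rates):
        total_delta = sum(deltas[i] for deltas in deltas_by_disease)
        adjusted.append(max(0.0, rate + total_delta))
    return adjusted


# Hypothetical BAU rates for two cycles, and differences from two disease states.
bau_mort = [0.010, 0.020]
bau_morb = [0.10, 0.12]
int_mort = apply_disease_deltas(bau_mort, [[-0.001, -0.002], [-0.0005, 0.0]])
int_morb = apply_disease_deltas(bau_morb, [[-0.002, -0.003], [0.0, -0.001]])
qalys_gained = life_table_qalys(int_mort, int_morb) - life_table_qalys(bau_mort, bau_morb)
```

The QALYs gained are simply the difference between the intervention and BAU tallies for the same cohort.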

Diet-related disease models
Diet has been linked to increased incidence of various cancers (e.g. colorectal), cardiovascular diseases (e.g. coronary heart disease (CHD), stroke) and osteoarthritis through dietary impacts on BMI. These diseases were modelled, within each disease process or parallel disease state as above, using a set of differential equations that describe the transition of people between four states (healthy, diseased, dead from a disease in the model, and dead from all other causes), with transition of people between the four states based on rates of background mortality, incidence, case-fatality and remission ( Figure 5, page 32).

Figure 5: Each disease is modelled with four states (healthy, diseased, dead from the disease, and dead from all other causes) and transition hazards between states of incidence, remission, case-fatality and mortality from all other causes.
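A minimal numerical sketch of the four-state structure follows. It integrates the transition equations with a simple Euler scheme over one year; this is an assumption made for illustration and is not necessarily how the model (or DISMOD II) solves them. The rates used are hypothetical.

```python
def step_year(state, incidence, remission, case_fatality, other_mortality,
              substeps=1000):
    """Advance (healthy, diseased, dead_disease, dead_other) by one year."""
    healthy, diseased, dead_disease, dead_other = state
    dt = 1.0 / substeps
    for _ in range(substeps):
        d_healthy = (-(incidence + other_mortality) * healthy
                     + remission * diseased) * dt
        d_diseased = (incidence * healthy
                      - (remission + case_fatality + other_mortality) * diseased) * dt
        d_dead_disease = case_fatality * diseased * dt
        d_dead_other = other_mortality * (healthy + diseased) * dt
        healthy += d_healthy
        diseased += d_diseased
        dead_disease += d_dead_disease
        dead_other += d_dead_other
    return (healthy, diseased, dead_disease, dead_other)


# Hypothetical annual rates for a cohort that starts entirely healthy.
after_one_year = step_year((1.0, 0.0, 0.0, 0.0), incidence=0.01, remission=0.0,
                           case_fatality=0.05, other_mortality=0.02)
```

The four flows sum to zero at every step, so the cohort is conserved across the four states.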
The default model structure was that diseases were modelled independently. Specifically, the sex-, age-and ethnic-specific incidence, remission, and case-fatality rates for each disease were modelled independently, e.g. the incidence rate for colorectal cancer did not vary with changes in the incidence rate (or prevalence) of kidney cancer. However, we include dependency for diabetes as a disease state, essentially treating it both as a disease state and a risk factor itself for coronary heart disease and stroke. Given this 'both a disease and risk factor' treatment of diabetes, we defer describing this model structure until after describing how risk factors are treated (i.e. Section 3.01.5).

How changes in risk factors change disease incidence
Health and cost impacts of simulated interventions are achieved by interventions changing risk factors (e.g. BMI) which in turn change disease incidence. This is similar to comparative risk assessment, and indeed involves 'shifts' in risk factor distributions that are merged with relative risks to determine PIFs, the percentage by which disease incidence is (usually) decreased. In this section we describe the model structure features, namely: 1. the risk factor to disease associations included in the model; 2. the calculation of the PIFs; 3. how decay (if any) in risk factor change is modelled over time; 4. how time lags between risk factor changes and disease incidence changes are modelled.
(Actual relative risks used are given in Appendix E: Relative risks of diet to disease associations (page 96). How dietary interventions change risk factors was described in PART 1. Baseline data on risk factors was described in PART 2 Section 2.01.)


(i) Risk factor-disease associations included in the BODE 3 DIET MSLT model
Risk factors were included if they met the following criteria:
- They were assessed as a top risk factor (top 20) in Australasia (Australia/New Zealand) in the GBD 2010 Study. 39
- There are interventions we plan to model that can modify this risk factor.
- There are data available:
  o Distributional data in New Zealand (e.g. NZANS)
  o RR data (to all key diseases; i.e. GBD sourced RRs preferable), and mutually adjusted for other risk factors in the model where possible.

Table 5 (page 33) shows the risk factor-diet associations operating in the BODE 3 DIET MSLT model. All diet-disease associations that met the above criteria were included in the model with planned modifications for future versions of the model outlined in Table 6 (page 34).

GBD risk factors to be included in Model V2 | Comment
Physical inactivity and low physical activity | To be added in the next version of the model (V2).
Diet low in nuts and seeds | To be added in the next version of the model (V2).
Diet low in whole grains | To be added in the next version of the model (V2).
Diet high in processed meat | Ideally to be added in the next version of the model (V2). Firstly investigate the level of effect that is mediated through other risk factors currently in the model (e.g. sodium). Add the risk factors into the model with appropriately modified RR.
Diet low in fibre | The effect of low fibre is completely mediated between 'diet low in whole grains, fruits and vegetables', risk factors either currently in the model or planned to be in the model (V2).
Diet low in seafood omega-3 fatty acids | There is no intake data for this risk factor in New Zealand.

Additionally, SSB intake (ranked as the 31st top risk factor in Australasia in the GBD 2010 Study 39 ) is included in the model due to the planned interventions that would impact on SSB consumption.

(ii) Calculation of PIF: Relative risk shift method
We modelled the health benefits of interventions through a reduction in incidence of each diet-related disease (Equation 4, page 35). The change in risk factor acts on the starting risk factor distribution by sex, ethnic and age groups. For each risk factor there are up to 10 categories of risk (e.g. for BMI: <20, 20-25, 25-30, 35-40, 40-45 and 45+; six categories). The proportion of the population for each sex, ethnic and age group that sits in each of those categories is obtained from the NZANS. This proportion, the category midpoint and the relative risk associated with that risk factor are mathematically combined with the effect size to calculate the PIF for each risk factor-disease combination - not by shifting proportions of the cohort by category, but rather by shifting the RR to what it would be for the new midpoint of the same starting category under the intervention 40 (more below). Note that all calculations were done by age, sex and ethnicity, although we omit these subscripts from the following equations for clarity.

$$I'_x = I_x \times (1 - PIF_x)$$

where: $I_x$ = the current incidence of disease x in the population; $I'_x$ = the new incidence of disease x after an intervention is implemented; and $PIF_x$ = the population impact fraction for disease x.
A PIF 41 is derived for each risk factor disease combination. For example, for CHD there were PIFs for the association between each of fruit, vegetables, BMI, sodium, percentage of total energy from polyunsaturated fatty acids and CHD.
The PIF is calculated using the Relative Risk shift method. 40 This method changes the relative risk of the categories and keeps the proportion in each category constant. For example, if categories are formed for every 5-point increase on the continuous scale (e.g. BMI), and the RR per 5-point increase was 1.5, and the intervention lowers everyone's risk factor (and therefore each category midpoint) by 1 unit, then each category's RR is lowered by 0.

where: n = the number of risk factors;
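The relative risk shift idea can be illustrated with the sketch below. It is not the report's own equation: it assumes the commonly used categorical PIF form (sum of p x RR before, minus sum of p x RR' after, divided by the sum before), a log-linear RR per unit of the risk factor, and a multiplicative combination of PIFs across n risk factors. The function names, category proportions and midpoints are hypothetical, and the TMREL is ignored here.

```python
def rr_at(midpoint, rr_per_unit, reference):
    """RR at a category midpoint, assuming a log-linear dose-response."""
    return rr_per_unit ** (midpoint - reference)


def pif_rr_shift(proportions, midpoints, rr_per_unit, shift, reference):
    """PIF when each category keeps its proportion but its RR moves to the
    value for the shifted midpoint (relative risk shift)."""
    before = sum(p * rr_at(x, rr_per_unit, reference)
                 for p, x in zip(proportions, midpoints))
    after = sum(p * rr_at(x + shift, rr_per_unit, reference)
                for p, x in zip(proportions, midpoints))
    return (before - after) / before


def combined_pif(pifs):
    """Assumed multiplicative combination across risk factors."""
    remaining = 1.0
    for pif in pifs:
        remaining *= (1 - pif)
    return 1 - remaining


def new_incidence(incidence, pif):
    """New incidence after the intervention: I' = I x (1 - PIF)."""
    return incidence * (1 - pif)


# Example from the text: RR of 1.5 per 5-point increase, everyone lowered by 1 unit.
rr_per_unit = 1.5 ** (1 / 5)
pif_bmi = pif_rr_shift(proportions=[0.3, 0.4, 0.3], midpoints=[22.5, 27.5, 32.5],
                       rr_per_unit=rr_per_unit, shift=-1.0, reference=22.5)
```

Under the log-linear assumption a uniform shift gives the same proportional change in RR for every category, so the PIF here does not depend on the hypothetical proportions chosen.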

Scaling of risk factor distribution and category midpoints
For the majority of the risk factors the risk factor distributions are taken straight from the NZANS as described above, however additional scaling is done for Sodium and SSB intakes. Sodium intake data is scaled to sodium excretion data as described in Section 1.01.2. SSB intake data are scaled to approximate usual intake as described below.

SSB intake to approximate usual intake
The majority of risk factors in the DIET model are foods or nutrients that will be consumed on a daily basis. SSBs on the other hand are a periodically consumed food group. GBD relative risk estimates are based on SSB consumption as recorded by food frequency questionnaires, and therefore represent estimates for usual intake of SSBs. Data from a single 24-hr recall is unlikely to accurately represent usual consumption of SSBs. Firstly, a single 24-hr recall is likely to underestimate the proportion of the population that consume some SSBs. Secondly, a single 24-hr recall is likely to overestimate the amount of SSBs consumed by individuals who do have SSB consumption recorded on the day of the survey. For these reasons, we rescaled SSB intakes from 24-hr recall data in NZANS to obtain a better estimate of usual population SSB intake.
We combined data from the overall NZANS sample with the subsample of the survey for whom two 24-hr recalls were recorded. This allowed us to calculate the probability of being an SSB consumer, and (for consumers) the probability of consuming SSBs on any given day. At the individual level, we then predicted whether an individual was a true zero consumer and, if not, we predicted a weekly frequency of SSB consumption. SSB intakes for (predicted) consumers were then scaled based on (predicted) consumption frequency to avoid overestimating SSB consumption in consumers. For example, an individual with 500ml SSB intake recorded in the single 24-hr recall with a predicted frequency of consumption of two days per week was assigned an estimated usual SSB intake of 143ml (1000ml estimated weekly total divided by seven). Estimates of usual intake for (predicted) consumers without consumption recorded in the single 24-hr recall were based on average recorded intake values for their age, sex, and ethnic group.
We simulated individual intakes 10,000 times and averaged across the runs to obtain estimates of population distributions of SSB intake. Each simulation randomly assigned different individuals with different frequency of consumption values, and also accounted for the survey standard error around initial estimates of the probability of ever-consumption and consumption on any given day.
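The frequency scaling and averaging across simulation runs can be sketched as follows. Only the 500ml, two days per week example comes from the text; the frequency options, the use of a uniform random choice and the function names are hypothetical simplifications, and the survey standard errors mentioned above are not modelled.

```python
import random


def usual_daily_intake(recorded_ml, days_per_week):
    """Scale a single-day recorded intake by weekly consumption frequency."""
    return recorded_ml * days_per_week / 7


def simulate_mean_usual_intake(recorded_ml, frequency_options, runs=1000, seed=0):
    """Average usual intake over runs that each assign a random frequency."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        days = rng.choice(frequency_options)  # hypothetical frequency assignment
        total += usual_daily_intake(recorded_ml, days)
    return total / runs


# Example from the text: 500 ml recorded, predicted frequency of two days per week.
example = usual_daily_intake(500, 2)
simulated = simulate_mean_usual_intake(500, frequency_options=[1, 2, 3, 4, 5, 6, 7])
```

Rounded to the nearest ml, the first call reproduces the 143ml quoted in the text.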

Theoretical Minimum Risk Exposure Level (TMREL)
In the Comparative Risk Assessment (CRA) approach, attributable burden is calculated in reference to a counterfactual risk exposure. In this modelling the counterfactual used is the Theoretical Minimum Risk Exposure Level (TMREL). The TMREL is a theoretically possible level of intake that minimizes overall risk. This allows us to quantify how much of the disease burden could be lowered by shifting the risk factor distribution to a 'theoretically possible' level associated with the greatest improvement in population health. 1 As the evidence for the TMREL is uncertain for the risk factors modelled, a range or uncertainty interval about the TMREL is used rather than just a central estimate. For risk factors where lower BMI or intake decreases disease incidence (BMI, SSBs and sodium), there is no effect in those categories whose midpoints are lower than the TMREL. For risk factors where higher intake decreases disease incidence (fruits, vegetables and polyunsaturated fatty acids) the method works in reverse: there is no effect in those categories whose midpoints are higher than the TMREL, i.e. people are already receiving maximum benefit from their high consumption.
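One way to express the 'no effect beyond the TMREL' rule is to clamp the exposure at the TMREL before computing the RR, as in the sketch below. The clamping approach, the log-linear RR and the treatment of a category that crosses the TMREL after the shift are assumptions for illustration; the function names and values are hypothetical.

```python
def effective_exposure(midpoint, tmrel, harmful=True):
    """Exposure that counts towards risk: clamped at the TMREL."""
    return max(midpoint, tmrel) if harmful else min(midpoint, tmrel)


def category_rr_ratio(midpoint, shift, rr_per_unit, tmrel, harmful=True):
    """Ratio of a category's RR after versus before a shift in its midpoint."""
    before = rr_per_unit ** (effective_exposure(midpoint, tmrel, harmful) - tmrel)
    after = rr_per_unit ** (effective_exposure(midpoint + shift, tmrel, harmful) - tmrel)
    return after / before


# Harmful risk factor (e.g. BMI): a category already below the TMREL is unaffected.
below_tmrel = category_rr_ratio(midpoint=20.0, shift=-1.0, rr_per_unit=1.1, tmrel=23.0)
above_tmrel = category_rr_ratio(midpoint=27.0, shift=-1.0, rr_per_unit=1.1, tmrel=23.0)

# Protective risk factor (e.g. fruit): a category already above the TMREL is unaffected.
protective = category_rr_ratio(midpoint=500.0, shift=50.0, rr_per_unit=0.999,
                               tmrel=400.0, harmful=False)
```

A ratio of 1.0 means the category contributes nothing to the PIF, which is the behaviour described in the text.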

(iii) Modelling decay or attenuation of effect
Many interventions, such as dietary counselling, have attenuating effects. For example, a particular dietary counselling regime may change population average BMI by 0.1 unit initially, but over the next 'x' years the population tends to regain weight back to their BAU levels. The length and shape (e.g. linear or exponential to return back to BAU) of this decay is informed by evidence relevant to the specific interventions modelled e.g. Dasinger et al. (2007) 42 , and specified in the model.
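The two decay shapes mentioned above can be sketched as simple functions of time since the effect began to decay. The parameterisation (a duration for linear decay, a rate for exponential decay) and the example values are hypothetical; the actual values are intervention-specific.

```python
import math


def linear_decay(initial_effect, years_elapsed, years_to_bau):
    """Effect declines in a straight line, reaching zero (BAU) at years_to_bau."""
    remaining = max(0.0, 1 - years_elapsed / years_to_bau)
    return initial_effect * remaining


def exponential_decay(initial_effect, years_elapsed, annual_rate):
    """Effect declines by a constant proportion per unit time."""
    return initial_effect * math.exp(-annual_rate * years_elapsed)


# Hypothetical: a 0.1 unit BMI reduction that returns to BAU over 5 years.
linear_path = [linear_decay(-0.1, t, 5) for t in range(7)]
halved_after_one_year = exponential_decay(-0.1, 1, math.log(2))
```

With linear decay the effect is exactly zero once the specified number of years has elapsed, whereas exponential decay approaches BAU without reaching it.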

(iv) Time lags
Changing diet does not usually rapidly change disease incidence; it takes time for disease incidence to change to a 'new equilibrium'. Evidence on time lags, and the shape of change in disease incidence, following dietary change, is very limited. Some simulation studies circumvent this by assuming the change in disease incidence is immediate. However, this will (grossly) over-estimate the effect of dietary intervention on cancer incidence (where time lags are likely to be decades), and moderately over-estimate changes in cardiovascular disease (where time lags might be months to years). This issue of time lags is compounded by discounting (i.e., little net benefit might be seen with a cancer preventing diet where a high discount rate is used in the model).
The approach we used was to look back to the average (1-PIF) reflecting the average change in risk factor in a past window of exposure. For example, the relevant time of exposure to increased fruit consumption on current CHD incidence may be the previous 5 years. Thus, we use the average (1-PIF) in the last five years. For cancers, it might take at least 10 years for any (notable) change in disease incidence to occur, and any benefit on disease incidence might last up to 30 years. Therefore, we would use the average (1-PIF) for 10 to 30 years ago. There is considerable uncertainty in these time lags. Therefore, we:
- Specify the minimum and maximum time lags (e.g. 10 and 30 years for cancers).
- Additionally make these parameters uncertain themselves (e.g. 20% SD normal distribution about minimum and maximum).
- Calculate the average (1-PIF) within this look-back time lag range.
We will include these parameters in actual publications, but in principle the following parameters (by disease) are the 'default'.
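The look-back averaging of (1-PIF) described in this section can be sketched as below. The inclusive window boundaries, the treatment of years before the intervention as PIF = 0, and the rounding of uncertain lags to whole years are assumptions made for illustration; the function names and the PIF values are hypothetical.

```python
import random


def lagged_one_minus_pif(pif_by_year, year, min_lag, max_lag):
    """Average (1 - PIF) over the window [year - max_lag, year - min_lag]."""
    window = [1 - pif_by_year.get(y, 0.0)
              for y in range(year - max_lag, year - min_lag + 1)]
    return sum(window) / len(window)


def draw_lag(central, rng, relative_sd=0.2):
    """Uncertain lag: normal draw with SD of 20% of the central value."""
    return max(0, round(rng.gauss(central, relative_sd * central)))


# Hypothetical: a sustained intervention from 2011 with an instantaneous PIF of 0.1.
pif_by_year = {y: 0.1 for y in range(2011, 2061)}
cancer_2021 = lagged_one_minus_pif(pif_by_year, 2021, min_lag=10, max_lag=30)
cancer_2041 = lagged_one_minus_pif(pif_by_year, 2041, min_lag=10, max_lag=30)

rng = random.Random(42)
uncertain_min_lag = draw_lag(10, rng)
```

In this hypothetical case only one year of the 2021 window falls after the intervention began, so incidence has barely changed, whereas by 2041 the whole window is post-intervention and the full effect applies.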

Diabetes: both a disease and a risk factor
The MSLT has key independence assumptions, including: 1. Risk factor distribution: the distributions of each risk factor can be treated as though independent of other risk factors. 2. Disease incidence rates: the incidence rate for a given disease (e.g. CHD) is independent of other diseases (e.g. the presence of diabetes). 3. Disease case-fatality and remission rates: the rates for a given disease (e.g. CHD) are independent of those for other diseases (e.g. diabetes).
The second assumption is the focus here, for diabetes. Diabetes is associated with increased rates of coronary heart disease and stroke (and some cancers), be it by shared common causes (i.e. confounding) or cause and effect (the concern here). Whether to address such 'dependency' depends on what one is doing with the model, through what risk factors. For the BODE 3 DIET MSLT model, interventions that change BMI and thence disease incidence are important. Figure 6 (page 39) gives the standard structure. BMI is independently associated with each of CHD and Diabetes Mellitus (DM). A change (∆) in the BMI distribution is combined with the relative risks for the BMI-CHD and BMI-DM associations to give PIFs, which result in a change in both disease incidence rates. The change in mortality, morbidity and cost rates that result are then 'added' to the overall mortality, morbidity and cost rates in the main life-table.

Figure 6: Standard structure in MSLT for BMI as risk factors and CHD and DM states
A modelled intervention that lowers BMI may result in overestimated QALYs if the reductions in diabetes and coronary heart disease 'double-count' the gains when considered independently. But if only the 'pure diabetes' mortality rate (e.g. based on the deaths coded as DM) is estimated in the DM state, and the higher than average population mortality rate otherwise (e.g. due to people with DM having higher CHD and stroke mortality) is not allowed for, the prevalence of DM will drift too high over time as the total mortality rate modelled for diabetics is not high enough. This over-estimated morbidity rate, in turn, may lead to an overestimate of morbidity gains due to a BMI lowering intervention. (And likewise an overestimate of cost savings, as costs are a function of prevalence.) One solution to this dependency problem is a microsimulation model, where each individual's other disease status is 'known'. But for the BODE 3 DIET MSLT model, the partial solution we use is to restructure and re-parameterise the model.
- [...] the PIF link from the DM prevalence to CHD and stroke incidence. And the excess rate of other deaths due to DM will (partly at least) be implicitly captured through the changes in (say) BMI to cancer that includes some unquantified pathway through diabetes.
  o Given the uncertainty above in the default model, as a sensitivity analysis we model excess mortality among people with DM from having diabetes, excluding CHD and stroke mortality as that is quantified in, and outputted from, the CHD (and stroke) states instead of the DM-only case fatality rate above. This will probably overestimate the mortality due to DM, but does give an upper limit.
- But to 'allow' for the higher mortality rate among diabetics, a 'total excess' mortality rate (mort[all-cause|DM] - mort[all-cause], where the former is the all-cause mortality rate among diabetics, and the latter is the all-cause mortality rate in the general population without DM) is applied within the DM state as an absorbing state. This mortality is only used to 'kill people off' in the model to allow for dependent mortality risk; without this higher mortality rate taking people out of the alive DM population, the prevalence would drift too high (impacting on costs and morbidity).

The above structure (Figure 7 and Figure 8, pages 41 and 42) and parametrization is an improvement for a disease like DM. However, it is not perfect.
The parameterisation of this modification to the MSLT requires recalculation of baseline or BAU parameters, and intervention parameters. Rather than present it here (before such parameterisation has been described for the main model), we give a full description of how the above model alteration was specified in Appendix D: Parameterisation of 'DM as both a risk factor and disease' (page 89).

Background population inputs
The following population parameters were included: 1) population size; 2) total prevalence years lived with disability (pYLDs); and 3) total mortality rates, all by 5-year age groups for each sex and ethnicity. Population counts were compiled using Statistics New Zealand 2011 estimates. Total pYLDs were calculated using the total (corrected for multiple morbidity) YLDs for all diseases in the NZBDS divided by the total population in New Zealand for each age, sex and ethnicity group. Population mortality rates were calculated using data from the Statistics New Zealand life tables for 2010-2012. Annual reductions in background population mortality were assumed to be 1.75% for non-Māori and 2.25% for Māori out to 2026 44 , then held constant.
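The background mortality trend can be sketched as below, assuming the annual reductions compound multiplicatively from a 2011 base and that the rate reached in 2026 is held thereafter. The compounding form, the function name and the example base rate are assumptions for illustration.

```python
ANNUAL_REDUCTION = {"non-Maori": 0.0175, "Maori": 0.0225}  # values from the text
BASE_YEAR = 2011
LAST_TREND_YEAR = 2026


def projected_mortality(rate_2011, year, ethnicity):
    """Background mortality rate in a given year, trend applied out to 2026."""
    years_of_trend = min(max(year, BASE_YEAR), LAST_TREND_YEAR) - BASE_YEAR
    return rate_2011 * (1 - ANNUAL_REDUCTION[ethnicity]) ** years_of_trend


# Hypothetical 2011 mortality rate of 0.01 per person-year.
maori_2026 = projected_mortality(0.01, 2026, "Maori")
maori_2040 = projected_mortality(0.01, 2040, "Maori")
non_maori_2026 = projected_mortality(0.01, 2026, "non-Maori")
```

Any year after 2026 returns the 2026 rate, reflecting the 'then held constant' assumption.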

Data sources, processing, DISMOD, and inputs to BODE 3 DIET MSLT model
The basic steps for generating disease inputs for the BODE 3 DIET MSLT model were: 1) data compilation; 2) preliminary processing of the data; and 3) DISMOD II estimation of epidemiologic parameters 45 .
Step 1: Data for these diseases were compiled from various sources (see Table 9).
Step 2: Some parameters were further processed to give 'best' (pre-DISMOD) estimates for 2011. For example, data on prevalence for less common diseases were compiled and then regression-smoothed prior to inputting into DISMOD II. Readers can refer to Appendix C: DISMOD II example for lung cancer (page 86) for a step by step description of data compilation and processing in DISMOD II for one example, lung cancer. (Similar documentation for all other diseases is available from the authors on request.) All parameters were generated by 5-year age groups by sex and ethnicity (Māori/Non-Māori), except breast, ovarian and endometrial cancers which were only compiled for women.
Step 3: These parameters were then inputted to DISMOD II, separately by sex and ethnicity, to generate a mathematically and 'epidemiologically consistent' set of parameters. For example, if the prevalence estimate was too low given what is known about incidence and case-fatality from the disease (and background 'competing' mortality), DISMOD II outputs values that are epidemiologically / mathematically consistent, allowing the user to 'weight' the inputs. For cancers, full weighting (setting at "100%") was given to incidence, as it was the most reliable parameter (due to New Zealand Cancer Registry data). Typically, mortality was also given full weighting and prevalence was given a 50% weighting (for disease-specific weighting information, README files for the disease of interest are available upon request from the authors, and for lung cancer (only) in Appendix C: DISMOD II example for lung cancer, page 86). For DM, stroke and CHD, we additionally included time trends in incidence and case fatality inputs to DISMOD II, given the strong time trends in these diseases. The DISMOD output rates (in one-year age groups) for incidence, prevalence, case-fatality and remission were then used to populate the BODE 3 DIET MSLT model for all diseases except CHD, stroke, type 2 diabetes and osteoarthritis. For CHD, stroke, type 2 diabetes and osteoarthritis, only incidence, prevalence, and case-fatality were used (i.e. remission was assumed to be zero as these are usually life-long conditions). For specific details on final parameters for each disease, see Table 10 below.
Generating disability rates (DRs) by dividing pYLDs by prevalent cases for each 5-year age group, for each disease, for each sex by ethnicity, was often too unstable due to sparse data. We therefore aggregated age groupings to ensure the sum of prevalent cases exceeded 10 (e.g. 0-44 year olds were always combined; for common diseases such as CHD and stroke age groupings were: 0-44, 45-54, 55-64, 65-74, and 85+ years; for rare diseases such as pancreatic cancer in Māori males all age groups were combined).

Disease | Incidence | Prevalence | Case-fatality | Remission | Disability rate
CHD | DISMOD II | DISMOD II | DISMOD II | Assumed zero | DISMOD II & NZBDS
Stroke | DISMOD II | DISMOD II | DISMOD II | Assumed zero | DISMOD II & NZBDS
Type 2 diabetes | DISMOD II | DISMOD II | DISMOD II | Assumed zero | DISMOD II & NZBDS
Osteoarthritis | DISMOD II | DISMOD II | DISMOD II | Assumed zero | GBD DW
Breast cancer | DISMOD II | DISMOD II | DISMOD II | DISMOD II | DISMOD II & NZBDS
Colorectal cancer | DISMOD II | DISMOD II | DISMOD II | DISMOD II | DISMOD II & NZBDS
Endometrial cancer | DISMOD II | DISMOD II | DISMOD II | DISMOD II | DISMOD II & NZBDS
Gallbladder cancer | DISMOD II | DISMOD II | DISMOD II | DISMOD II | DISMOD II & NZBDS
Head & neck cancer | DISMOD II | DISMOD II | DISMOD II | DISMOD II | DISMOD II & NZBDS
Kidney cancer | DISMOD II | DISMOD II | DISMOD II | DISMOD II | DISMOD II & NZBDS
Liver cancer | DISMOD II | DISMOD II | DISMOD II | DISMOD II | DISMOD II &
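The aggregation of sparse age groups before calculating disability rates (pYLDs divided by prevalent cases) can be sketched as follows. The greedy merging of adjacent age groups, and the folding of any sparse remainder into the preceding group, are assumptions made for illustration rather than the documented procedure; the function name and input values are hypothetical.

```python
def disability_rates(groups, min_cases=10):
    """groups: list of (age_label, pyld, prevalent_cases), ordered by age.

    Adjacent age groups are merged until each merged group has more than
    min_cases prevalent cases; the rate is then pYLD / prevalent cases.
    """
    merged = []      # each entry: [label, pyld_sum, case_sum]
    current = None
    for label, pyld, cases in groups:
        if current is None:
            current = [label, pyld, cases]
        else:
            current = [current[0] + "+" + label, current[1] + pyld, current[2] + cases]
        if current[2] > min_cases:
            merged.append(current)
            current = None
    if current is not None:          # a sparse remainder at the oldest ages
        if merged:
            last = merged[-1]
            merged[-1] = [last[0] + "+" + current[0],
                          last[1] + current[1], last[2] + current[2]]
        else:
            merged.append(current)   # everything combined into one group
    return {label: (pyld / cases if cases else 0.0)
            for label, pyld, cases in merged}


# Hypothetical pYLDs and prevalent cases for one disease, sex and ethnic group.
rates = disability_rates([("0-44", 0.2, 4), ("45-54", 0.6, 8),
                          ("55-64", 3.0, 30), ("65-74", 0.5, 5)])
```

For a very rare disease every age group may end up in a single merged group, which corresponds to the 'all age groups were combined' case in the text.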

Final processing of incidence and prevalence estimates
In an effort to more accurately reflect the disease epidemiology in the New Zealand population, some diseases incidence and prevalence rates were forced to be zero at young ages as a final step in processing. Specifically, the incidence and prevalence rates for all cancers were set to 0 for those 20 years and younger, for CHD they were set to 0 for 18 year olds and younger and for stroke they were set to 0 for 24 year olds and younger.

Future disease trends (incidence, remission and case-fatality)
The above parameterisation was for 2011 only. Some key parameters are known to have increasing or decreasing trends in recent decades -and are likely to have such trends in the near-future. Thus, we also specified future disease incidence and case-fatality as percentage annual change from 2011 to 2026. For CHD and stroke, we relied on NZBDS projections for annual changes in incidence and mortality (see Table 8 in a Report 47 ). Specifically, we incorporated an annual incidence change of -2% and an annual case-fatality trend of -2% for CHD and stroke.
For cancer trends, we relied on our previous modelling of future cancer incidence. 46 Uncertainty around the incidence, case-fatality and remission trends was included in the model for all diseases, with an SD of 1 percentage point about the annual percentage change. This uncertainty draw is independent for each epidemiological parameter (i.e. incidence, case-fatality and remission) by disease, but correlated (r=1.0) across each of the four sex by ethnic groupings and all diseases.
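As an illustration only, the trend specification above can be sketched in a few lines of Python. The sketch assumes that the annual percentage change compounds multiplicatively, that the uncertainty is normally distributed, and that rates are held constant after 2026; none of these details is stated in the text, and the function names and baseline rates are hypothetical.

```python
import random

BASE_YEAR = 2011
LAST_TREND_YEAR = 2026  # trends are specified from 2011 to 2026


def draw_annual_change(mean_pct, sd_pct_points=1.0, rng=random):
    """Draw an annual percentage change (e.g. -2.0) with an SD of
    1 percentage point. The normal distribution is an assumption."""
    return rng.gauss(mean_pct, sd_pct_points)


def project_rate(rate_2011, annual_pct_change, year):
    """Compound the annual change from 2011 to `year`.
    Holding the rate constant after 2026 is an assumption."""
    years_of_trend = max(0, min(year, LAST_TREND_YEAR) - BASE_YEAR)
    return rate_2011 * (1.0 + annual_pct_change / 100.0) ** years_of_trend


# One draw per epidemiological parameter, reused across the four
# sex-by-ethnic groups (r = 1.0 across groups). Baseline rates are made up.
rng = random.Random(1)
chd_incidence_change = draw_annual_change(-2.0, rng=rng)
baseline_rates = {"group_1": 0.010, "group_2": 0.012,
                  "group_3": 0.006, "group_4": 0.007}
projected_2020 = {group: project_rate(rate, chd_incidence_change, 2020)
                  for group, rate in baseline_rates.items()}
```

Because a single draw is shared across the groups, every group's projected rate moves up or down together within an iteration, which is what a correlation of 1.0 implies.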

Disease health system cost inputs
Just as proportions of the cohort 'alive' in the overall and disease process are rewarded with additional QALYs for each annual cycle they live, so too can health system costs be 'rewarded'. In the BODE 3 DIET MSLT model, we have five types of health system cost:  Main life-
Of note, health system costs will be updated in the future (as more years of data are accrued, and with 'improvements' to scale costs to more accurately reflect VOTE: Health), and productivity costs (human capital approach) will be added in future models.

Section 3.03. Calibration
Calibration has been described as ensuring that "inputs and outputs are consistent with available data". 48,49 To a large extent, the BODE 3 DIET MSLT model is self-calibrating on inputs; the model uses total New Zealand population data for 2011, with some modification (usually slight) with DISMOD II to ensure epidemiological coherence.
As an additional calibration check, we compared the following rates for CHD, stroke and diabetes, in Figure 9 to Figure 14 (pages 49 to 54):
- MSLT model input incidence, case fatality and prevalence, which are actually outputs from DISMOD II.
- DISMOD mortality rates. They are neither inputs nor outputs for the MSLT, but are one of the rates used in DISMOD to develop the coherent set of epidemiological parameters, most notably the case fatality input rate.
- MSLT model output prevalence and mortality rates. These differ from the DISMOD mortality and prevalence rates, as they are determined dynamically within the model as the cohorts (aged 2, 42 or 72 in 2011) age within the model.
The model check is that we expect the output prevalence and mortality rates to differ somewhat, but not too much, from the input prevalence and mortality rates given what we know about epidemiological trends and transitions. In brief, they appear to, and thus provide a form of calibration 'check' on the model.
In more detail, consider first the CHD rates in Figure 9 (page 49) and Figure 10 (page 50). The DISMOD and output mortality rates are virtually indistinguishable as the cohorts age. The output prevalence, however, is a bit lower. But this is coherent. The inputs are the rates in 2011. As CHD incidence is falling so rapidly, the prevalence as recorded in 2011 is higher than what it would have been if incidence had not been falling in the past. Put another way, for these graphs where 2011 rates are used as inputs with no future time trends, the prevalence rate is at 'equilibrium' for these inputs, whereas the prevalence as recorded in 2011 is not at equilibrium. Thus, we conclude the CHD rates are plausible and coherent.
Stroke rates are shown in Figure 11 (page 51) and Figure 12 (page 52). There is closer agreement than with CHD.
Diabetes rates are shown in Figure 13 (page 53) and Figure 14 (page 54). Here the pattern is the reverse of that for CHD, which is plausible and coherent: as diabetes incidence rates have been increasing (and case fatality rates decreasing), the observed prevalence rates by age in 2011 are less than the 'equilibrium' prevalence rates that the model outputs over time into the future.
As at younger ages it is a bit difficult to see differences in rates on an absolute scale, Appendix G: Model rates vs. DISMOD rates, log graphs from Section 3.04.5 (page 119) gives replicates of these calibration graphs with rates on a log scale. It is only with diabetes that a difference in prevalence rates between input and output series remains evident.
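The idea that output prevalence is 'determined dynamically' by the input rates can be illustrated with a toy cohort calculation. This is a minimal sketch, not the BODE 3 implementation: it assumes a three-state process (healthy, diseased, dead), annual cycles, no remission, rates converted to probabilities with 1 - exp(-rate), and entirely made-up age-specific rates.

```python
import math


def cohort_prevalence(incidence, case_fatality, other_mortality):
    """Prevalence among survivors at the end of each annual cycle,
    for a cohort that starts disease-free (a simplifying assumption)."""
    healthy, diseased = 1.0, 0.0
    prevalence = []
    for inc, cf, mort in zip(incidence, case_fatality, other_mortality):
        surv_healthy = math.exp(-mort)           # survive background mortality
        surv_diseased = math.exp(-(mort + cf))   # background plus case fatality
        p_onset = 1.0 - math.exp(-inc)           # annual probability of onset
        new_cases = healthy * surv_healthy * p_onset
        healthy = healthy * surv_healthy - new_cases
        diseased = diseased * surv_diseased + new_cases
        prevalence.append(diseased / (healthy + diseased))
    return prevalence


# Hypothetical rates for ages 40 to 89.
ages = range(40, 90)
incidence_2011 = [0.0005 * math.exp(0.07 * (a - 40)) for a in ages]
case_fatality = [0.02] * len(ages)
other_mortality = [0.002 * math.exp(0.09 * (a - 40)) for a in ages]

# Prevalence implied by holding 2011 rates fixed.
model_output = cohort_prevalence(incidence_2011, case_fatality, other_mortality)
# Prevalence implied if past incidence had been 50% higher (as for falling CHD).
historical = cohort_prevalence([1.5 * i for i in incidence_2011],
                               case_fatality, other_mortality)
```

In this sketch the prevalence generated under higher past incidence sits above the prevalence implied by the 2011 rates at every age, which is the direction of the CHD discrepancy described above; reversing the scaling would mimic the diabetes pattern.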

Section 3.04. Validation
The BODE 3 DIET MSLT model is a multi-application model, for studying preventive interventions. We attempted some validation given this broad remit. However, it was and is impossible to fully validate the model, both due to resource limitations and an absence of 'gold standard' data for many interventions (e.g. no randomized trials of saturated fat taxes through to disease incidence outcomes exist). Validation of the BODE 3 DIET MSLT model will continue alongside producing results (e.g. comparisons with overseas models), and new data will be forthcoming (e.g. disease incidence trends, intervention effect sizes). Thus, future improvements to the model are likely.
We organize this section using the headings from an International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Good Practices in Modelling Task Force consensus paper: 48 face validity, verification (or internal validity), cross validity, external validity, and predictive validity.

"Face validity is the extent to which a model, its assumptions, and applications correspond to current science and evidence, as judged by people who have expertise in the problem." 48
The BODE 3 DIET MSLT model follows the form and structure of a MSLT, and more specifically the Assessing Cost Effectiveness (ACE) Prevention models 37,50-58 (including dietary and Physical Activity models) and the BODE 3 Tobacco model 59,60 . These models have been peer reviewed many times, lending face validity.
- Other known risk factors (e.g. nuts and seeds) will likely be included in the future.
- As a macrosimulation model, it is difficult to allow for correlated risk factor distributions. That is, the BMI distribution is assumed independent of the fruit and vegetable intake distribution, across the population. Thus, the BODE 3 DIET MSLT model, unless modified or adapted, will be limited in answering questions around targeting of populations with multiple poor risk factors. The model also treats diseases as independent; no correlations in disease incidence for, say, CHD and lung cancer are allowed for. (Importantly, though, we do treat diabetes as both a risk factor and disease, allowing for the dependency of diabetes with CHD and stroke.)
We have not formally subjected the BODE 3 DIET MSLT model to external face validity review prior to submission of publications for scientific peer review.

Verification (or Internal Validity)
"Verification addresses whether the model's parts behave as intended and the model has been implemented correctly." 48
A regular process of verification was and is used in modifying and extending the BODE 3 DIET MSLT model, namely:
- The following procedure was followed once model development was complete, checked and signed off by one of the Programme Directors: all model changes are undertaken by one team member, checked and signed off by a second team member, and signed off by one of the Programme Directors. This process accords with an Accountability for Quality Assurance process outlined by the UK Department of Energy and Climate Change in their guidance for quality assurance of Excel-based models 61 , and is documented more fully in a BODE 3 quality assurance protocol (forthcoming). All model builds and extensions are 'logged' in a 'readme' tab in the model.
- The following checks are implemented for a model version to be signed off:
o A second team member independently checking a random selection of formulas and links in models
o A second team member independently working through each process from beginning to end (e.g. risk factor A distribution, merged with risk factor A relative risks, to population impact fractions and their connection with disease incidence, then all-cause mortality, etc.).
- A series of sensitivity analyses are undertaken to logic (stress) test the model. This covers both extreme values and a likely range of values to check how the model responds. For example, trends in disease incidence rates are turned off and the outputs compared against expectation; the results of this checking are signed off by a Programme Director. For stress testing, selected input parameters are changed to extreme values (e.g. turning disease incidences to zero, one by one) to ensure changes in model outputs are consistent with expectation.
The above and other BODE 3 quality assurance processes are documented more fully elsewhere, 62 as well as specifically to the BODE 3 DIET MSLT model in its Readme tab.

"Cross-validation involves comparing a model with others and determining the extent to which they calculate similar results." 48
Model comparisons within the BODE 3 Programme have occurred, and are proposed with other international groups.
Within the BODE 3 programme, identical dietary salt reduction interventions were run through an early iteration of the BODE 3 DIET MSLT model and a CVD model built in TreeAge that had previously been developed by BODE 3. 63,64 When an intervention of a decrease in sodium of 22.8 mmol/day was run through both models, the overall QALYs gained were 110,000 in the TreeAge model and 103,000 in the DIET MSLT model (3% discounting). As there are a number of differences between the models, generating results within 20% of each other was regarded as satisfactory, and the difference seen was closer to 5%. From our investigations it seems that the differences seen between the two models were due to a combination of different baseline incidence rates, baseline case-fatality rates and differing disability rates/weights between the two models. Model structure, definitions of stroke and effect size calculations do not appear to contribute very much to the differences seen.
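The cross-model comparison above is simple arithmetic, sketched here for transparency. The choice of denominator (the larger of the two estimates) is an assumption; the text does not say how the percentage difference was computed.

```python
def relative_difference(a, b):
    """Absolute difference as a fraction of the larger value (an assumed convention)."""
    return abs(a - b) / max(a, b)


treeage_qalys = 110_000   # TreeAge CVD model, 3% discounting
mslt_qalys = 103_000      # early DIET MSLT model, 3% discounting

diff = relative_difference(treeage_qalys, mslt_qalys)
within_tolerance = diff <= 0.20   # the 20% threshold regarded as satisfactory

print(f"{diff:.1%}")  # prints 6.4%
```

Under this convention the difference is about 6%, well inside the 20% threshold and in line with the 'closer to 5%' description.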
Model comparisons are also underway with the Nuffield Department of Population Health, Oxford University (Adam Briggs, Peter Scarborough and colleagues) who are working on similar types of models with similar food taxes and subsidy interventions (e.g. 65-67 ). Model comparisons proposed include 'stripping back' to the same population demography and epidemiology to allow a head-to-head comparison of any differences in model structure, then sequential addition of varying population epidemiology (e.g. disease incidence rates, case-fatality and trends), and population demography (e.g. varying age structures).

External Validity
"In external validation, a model is used to simulate a real scenario, such as a clinical trial, and the predicted outcomes are compared with the real world ones." 48 Randomized trials through to disease incidence for the interventions proposed to be modelled with the BODE 3 DIET MSLT model are rare. We will consider the relevance of one of these for such validation work: a major sodium reduction trial on health outcomes, 68 but we note this might not prove to be informative given the decline in CVD incidence over the 20 years of this trial.
Meta-analyses of trials (where available) are used for parameterizing intervention effect sizes in the model (e.g. association of mHealth on weight loss 69 ).
'Natural experiments' -as they accrue (e.g. Danish food taxes 67,70 and Mexican SSB taxes 71 ) -will also provide comparison points.

"Predictive validity involves using a model to forecast events and, after sometime, comparing the forecasted outcomes with the actual ones." 48
It was not possible to compare forecast incidence and mortality rates in New Zealand for various interventions with model forecasts, as none of the interventions have been applied. However, it will be possible to compare BAU trends in disease incidence from the 2011 base-year out in due course.

Section 3.05. Model: Analysis
For each intervention, the model is run 2000 times using Monte Carlo simulation. Probabilistic uncertainty is included for intervention effect sizes (e.g. price elasticities, relative risks for the association between diet and disease incidence), intervention costs (e.g. cost of a new tax law) and selected baseline parameters (i.e. health system costs were assumed to have a gamma distribution with a SD of +/-10%).
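A minimal sketch of the cost uncertainty described above, using only the Python standard library rather than the Ersatz add-in actually used. It reads 'a SD of +/-10%' as a standard deviation equal to 10% of the mean, which is an interpretation, and the mean cost of 1000 is a made-up value.

```python
import random
import statistics


def draw_cost(mean_cost, cv=0.10, rng=random):
    """Draw one cost from a gamma distribution with the given mean and
    a standard deviation of cv * mean (cv = 0.10 is the assumed reading)."""
    shape = 1.0 / cv ** 2          # 100 when the SD is 10% of the mean
    scale = mean_cost * cv ** 2    # shape * scale recovers the mean
    return rng.gammavariate(shape, scale)


rng = random.Random(2011)
draws = [draw_cost(1000.0, rng=rng) for _ in range(2000)]  # 2000 iterations

mean_of_draws = statistics.mean(draws)
sd_of_draws = statistics.stdev(draws)
```

With 2000 draws the sample mean and SD should land close to 1000 and 100 respectively, and every draw is strictly positive, which is the practical reason for preferring a gamma over a normal distribution for costs.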
We included uncertainty in the annual percentage changes in selected disease incidence trends (see above) for the diseases that made the largest contribution to the QALYs gained (and hence also cost savings).
Uncertainty around the starting estimates of incidence and case-fatality has been included in the model. Year 2011 starting estimates have been assigned a log-normal distribution, SD of +/-5%, with random draw in each iteration separately for incidence and case-fatality, by sex and age, but applied uniformly across ages (i.e. independent uncertainty by sex and age, but 100% correlated uncertainty by age within sex by ethnic groups).
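The correlation structure for the 2011 starting estimates can be sketched as follows. The sketch reads the text as one draw per parameter and per sex-by-ethnic group, applied uniformly across all ages; it treats the 5% SD as being on the log scale with a median of 1.0. The group labels, rates and function names are hypothetical.

```python
import random

# Hypothetical labels for the four sex-by-ethnic groups.
GROUPS = [("male", "Māori"), ("male", "non-Māori"),
          ("female", "Māori"), ("female", "non-Māori")]
PARAMETERS = ["incidence", "case_fatality"]


def draw_scalers(sd_log=0.05, rng=random):
    """One multiplicative scaler per (parameter, group), log-normal with
    median 1.0. Treating the 5% SD as the log-scale SD is an assumption."""
    return {(param, group): rng.lognormvariate(0.0, sd_log)
            for param in PARAMETERS for group in GROUPS}


def apply_scaler(rates_by_age, scaler):
    """Apply the same scaler to every age (100% correlated across ages)."""
    return {age: rate * scaler for age, rate in rates_by_age.items()}


rng = random.Random(0)
scalers = draw_scalers(rng=rng)
incidence_by_age = {50: 0.010, 60: 0.020, 70: 0.035}   # made-up rates
scaled = apply_scaler(incidence_by_age,
                      scalers[("incidence", ("male", "Māori"))])
```

Because the scaler is shared across ages, the age pattern of the rates is preserved within each iteration and only the overall level shifts.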
All modelling is undertaken in Microsoft Excel®, using the add-in tool Ersatz (EpiGear, Version 1.3) for uncertainty analysis with R-software 'add-ons' for batch processing and output collation.

Section 3.06. A note on interchangeable use of 'DALYs averted' and 'QALYs gained'
Previous BODE 3 modelling 50 termed health gain 'DALYs averted'. We use the terms 'DALYs averted' and 'QALYs gained' interchangeably. Why? Two reasons. First, the QALYs gained (or DALYs averted) in the MSLT modelling are not the same as DALYs calculated in a BDS. In the BDS they are (usually) calculated in one cross-sectional year, as a shortfall against an ideal standard (e.g. the best sex-specific life-table mortality rates in the world). In BODE 3 (and other related MSLT modelling, e.g. ACE-Prevention 50 ) the QALYs gained are the difference between the starting population's expectation of the remainder of their lives, and that under the intervention scenario. Second, the morbidity weights or 'health status valuations' (HSV) are pairwise comparisons conducted for the GBD 36 , and as such are one variant of HSV used in routine economic evaluations and QALY estimation. 72 These morbidity weights, given their derivation for the GBD, are called disability weights.
The disability weighting (DW) (in this case DRs, which in turn stem from DWs applied in the BDS itself) assigned is just one variant of health status valuation (HSV); QALYs use a variety of HSVs (e.g. those from EQ5D, etc.). Furthermore, DALYs in the BDS use an external or reference life-table (to generate a health gap or loss measure); in this multi-state life-table, the DALYs averted are at the incremental margin for the 2011 New Zealand population, the same concept and method as used for QALYs. The only conceptual difference between the QALYs we calculate and the various QALYs presented in much other research is the HSV metric. In other cost-utility analyses the source of HSV is likely to vary between studies (arguably to fit the population's preference, but more usually due to the pragmatics of different questionnaires used) whereas our QALYs are derived from one very large and coherent set of disability weights calculated in the GBD 2010 from multi-country surveys. 36 We do not claim that the HSV in our QALY is 'better' than that used in other QALY estimates; there is genuine uncertainty in all HSVs.
The QALY metric captures health gain (assuming the intervention is beneficial) that arises from a mix of change in years of life and quality of each year of life. Usually a gain in QALYs (in prevention interventions at least) is due to a gain in life years lived (with or without change in quality of life). Note, however, that it is possible to achieve QALY gains with a reduction in life years lived (but very good improvements in quality of life), or with an increase in life years gained that is greater than any 'penalty' from living in lower quality of life.
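A small worked example makes the two less obvious cases concrete. All numbers are hypothetical, the quality weight stands in for one minus the disability rate, and discounting is ignored for simplicity.

```python
def qalys(life_years, quality_weight):
    """Undiscounted QALYs for a constant quality weight (1 = full health)."""
    return life_years * quality_weight


baseline = qalys(10, 0.60)

# Case 1: one fewer life year, but much better quality of life.
fewer_years_better_quality = qalys(9, 0.80)
gain_case_1 = fewer_years_better_quality - baseline

# Case 2: more life years, lived at somewhat lower quality.
more_years_lower_quality = qalys(12, 0.55)
gain_case_2 = more_years_lower_quality - baseline
```

Both gains are positive (roughly 1.2 and 0.6 QALYs), even though the first scenario shortens life and the second lowers average quality of life.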

2.
Step 2: Processing in DISMOD II software
Below are examples of the weighting schemes used for lung cancer parameters (Figure 16, page 90).

Figure 16: Example of parameter weighting in DISMOD II
Note that there was sometimes considerable instability in the case-fatality rates at younger ages. This is a function of sparse data, and the case-fatality rate needing to 'move' to reconcile with the incidence and mortality inputs (and to a lesser extent prevalence). Once inputted to the BODE 3 DIET MSLT model, it does however balance out to ensure a target mortality rate (which largely drives the health loss/gain). It must be noted that the VDR is getting 'better' or more comprehensive over time, meaning that if it is used to generate year-on-year incidence rates, these will be spuriously high. (This may become less of a problem in future years once data systems and case definitions equilibrate.) However, it should be more accurate for prevalence, and if prevalent cases are also used to generate morbidity and costings, and mortality rates among this pool of prevalent cases, then there is coherence for these parameters excluding incidence.

(i) Incidence and prevalence rates
In principle:
- The DM prevalence is just that observed on 31 December 2011.
- The DM incidence is the new cases observed each year. But as noted above, we expect it will be spuriously high using VDR data up to 2014 at least. The decision was therefore made to ignore incidence in DISMOD.
Regression on the VDR linked with mortality and core population files was used to estimate annual prevalence (logistic model; main effects of sex, age (categorical in five-year age groups), and ethnicity; and interactions of main effects), using the predicted values for 2011.
Due to the artificially high estimates of incidence this parameter was ignored in DISMOD; details are provided below.

(ii) Mortality rates
Diabetes is a difficult disease to model because it is itself a risk factor for other diseases, so its mortality rates are dependent on other diseases (e.g. it is no longer viable to assume independence of disease incidence and mortality when considering DM and CHD). It is important to keep in mind the BODE 3 DIET MSLT model that is being parameterized, and its model structure. Namely:
- DM is treated as a disease state just as any of the other states are (e.g. CHD, stroke, lung cancer). However, it is also a risk factor in and of itself for CHD and stroke, meaning that changes in DM prevalence are linked through PIFs to changes in CHD and stroke incidence.
- A diagnosis of DM causes a non-ignorable increase in mortality for deaths coded with other than DM as the underlying cause of death. Some of this is causally due to DM, but some of it is due to confounding or correlated common causes (e.g. BMI as a risk factor for both DM and a range of cancer deaths and stroke and CHD).
o For the purposes of the DIET MSLT, we classify CHD and stroke as causally related. We assume this is captured by the above link of changing DM prevalence to changing CHD and stroke incidence (through a PIF) that then flows onto change in mortality from CHD and stroke, per se.
o DM-coded deaths -by definition -are causally due to DM. Changes in such DM-specific or DM-coded mortality in the DM state (due to changes in disease incidence from a given intervention) link to the main life-table, capturing mortality rate gains from interventions lowering DM incidence.
o The non-causally related deaths (i.e. non-CHD, non-stroke and non-DM-coded, or simply 'other') are not captured as an effect of the intervention, and therefore do not link through to the main life-table. However, they still matter as far as determining the prevalence. That is, if we do not allow for higher 'other' competing mortality among diabetics, the future simulated prevalence will be too high, leading to overestimated morbidity and health system cost impacts of interventions.
To satisfy all these requirements, the MSLT needs the following mortality rates:

(iv) DISMOD
For any given disease, the following parameters are mathematically related: incidence, duration, prevalence, case fatality. Therefore, if (some of) these parameters are estimated separately, they may not be mathematically coherent as a system. In our example, the VDR case definition of who was a diabetic may have some (differential over time) misclassification bias, meaning that the incidence rates are (somewhat) biased.
DISMOD II is an epidemiological tool 45 that takes in sets of these parameters, and outputs a coherent set of the same input parameters (plus those from the above list for which input data was missing).
The input and output estimates should -of course -be close, acting as a check.
Treating diabetes as the disease of interest, we inputted the following parameters (for 2011):
- Prevalence (see above)
- Remission rate set at zero (i.e. assumption that once you have diabetes, you have diabetes forever)
- Case fatality (see above), or excess DM mortality relative to that in the general population = Mx[all-cause|DM] - Mx[all-cause|non-DM] = excess all-cause mortality rate among diabetics
- Population mortality rate due to DM (see above)
- And, as is required, the all-cause mortality rate in the general population.

This case fatality generated the mortality difference (between the BAU and intervention) in the DM state that was then 'added to' the all-cause mortality rate in the main life-table. CHD and stroke deaths were excluded from the mortality rate linked to the main life-table from the DM state, as this mortality was captured in the CHD and stroke disease processes (with the DM state acting as a risk factor to change incidence inflow to the CHD and stroke states).

Preventing double-counting of BMI effects on DM and on CHD and stroke
BMI is a risk factor for diabetes, CHD and stroke. However, diabetes is itself also a risk factor for CHD and stroke. This is illustrated for CHD in Figure 17 (page 96).
Although diseases in the BODE 3 DIET MSLT model are assumed to be independent, we added a link between changing diabetes prevalence and CHD and stroke incidence, using relative risks from systematic reviews of cohort studies by Peters et al 75,76 to quantify the increased risk of CHD and stroke in diabetics (Pathway B2 in Figure 17: The relationship between BMI, diabetes and CHD, page 94). To then prevent double-counting of CHD and stroke effects, we determined the relative risks of CHD and stroke associated with changes in BMI that would not be mediated by diabetes (Pathway A in Figure 17). Since there are no published estimates of these RRs, we derived them using the GRG nonlinear method of optimisation in Excel, assuming that the fractions of the disease attributable to BMI directly (Pathway A in Figure 17) and indirectly via diabetes (Pathway B in Figure 17) must sum to the total attributable fraction (Pathway C in Figure 17).

*The RRs used here were an average of the RRs in the GBD for cancers of the larynx, nasopharynx and other pharynx and mouth.
*RRs were the same in the GBD paper published in 2016 77
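The derivation of the direct (Pathway A) relative risks described above can be sketched numerically. This is a deliberately simplified illustration, not the authors' Excel calculation: it assumes a single binary exposure with the standard attributable fraction formula, takes the indirect fraction as a given input, and swaps the GRG nonlinear solver for a plain bisection search. All input values and function names are hypothetical.

```python
def paf(prevalence, rr):
    """Population attributable fraction for a binary exposure."""
    excess = prevalence * (rr - 1.0)
    return excess / (1.0 + excess)


def solve_direct_rr(prevalence, rr_total, paf_indirect, tol=1e-10):
    """Find the direct RR so that direct + indirect fractions equal the total.
    Bisection is used here in place of Excel's GRG nonlinear solver."""
    target = paf(prevalence, rr_total) - paf_indirect
    if target < 0:
        raise ValueError("indirect fraction exceeds the total fraction")
    low, high = 1.0, rr_total
    while high - low > tol:
        mid = (low + high) / 2.0
        if paf(prevalence, mid) < target:
            low = mid    # direct RR must be larger
        else:
            high = mid   # direct RR must be smaller
    return (low + high) / 2.0


# Hypothetical inputs: 30% exposed, total RR of 1.5 (Pathway C),
# and 4% of disease attributable via diabetes (Pathway B).
rr_direct = solve_direct_rr(prevalence=0.30, rr_total=1.5, paf_indirect=0.04)
```

The solved direct RR lies between 1.0 and the total RR, so applying it alongside the diabetes link does not count the diabetes-mediated effect twice.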