The thesis aims to provide a systematic framework for modelling the relationship between hospital volume and long-term survival in oncology and, as a practical goal, to analyse this relationship for centres treating breast cancer in Lombardy, based on retrospective data. In the first part, the methodological issues were reviewed, starting with the definition of a causal-analysis framework and making explicit the assumptions that ensure identifiability. The criteria for choosing the type of volume and for calculating it were then reviewed. The hierarchical nature of this type of analysis, where subjects nested within hospitals tend to have correlated outcomes, requires regression methods that account for the cluster effect, and these were discussed and examined. We concentrated on marginal (estimated with GEE or not) and conditional (mixed-effects) models. These methods also need to be robust to cluster informativeness, meaning that cluster size is not independent of the outcome; this follows from our primary hypothesis that hospital volume has an effect on the outcome and from the high correlation between hospital volume and cluster size. Finally, the need to analyse the volume-outcome relationship using flexible functional forms for volume was restated, as the effect of volume on outcome is not linear in many applications, and suitable splines were reviewed. On the basis of the discussed methods, a simulation study was then designed to mimic the structure of the data in our application setting and to assess the performance of marginal and conditional Cox models. The motivating example consisted of a registry cohort of 18,938 patients with epithelial breast cancer, diagnosed between January 2014 and December 2016, resident and treated in five Lombardy Agencies for Health Protection (AHP). Cancer characteristics at diagnosis were retrieved from the AHP registries and information on treatment from hospitalization, outpatient and drug databases.
Follow-up was ascertained through the registry of health service beneficiaries up to December 2019. A total of 789 patients were excluded because no treatment was registered and 651 because they were treated outside the study area. Of the 17,498 included patients, 95% were surgically treated, 38% received chemotherapy and 56% radiotherapy. Missing values in potential confounders (stage, tumour grade and educational qualification) were imputed using multiple imputation for clustered data, through the multiple imputation by chained equations (MICE) algorithm with predictive mean matching (PMM). The number of hospitals was 75 for surgery, 24 for radiotherapy, and 61 for chemotherapy. To determine which volume was most associated with survival, both previous-year and average previous 3-year volumes were evaluated, considering either specific volume (breast cancer only) or overall volume (all breast surgeries, or the total number of treated patients for chemotherapy and radiotherapy). The association between volume and death was then estimated using a marginal Cox model with hospital as the cluster variable, including the following potential confounders: age, stage, morphology, comorbidity index, education level, tumour grade, emergency diagnosis. The Akaike information criterion (AIC) was used to compare models with linear and non-linear volume effects. For the latter, we explored graphical trends with volume modelled as a penalized spline. The volumes selected for the final analyses were the specific average volume of the previous 3 years for surgery, and the specific average volume of the previous year for chemotherapy and radiotherapy. The 2-level (hospital) and 3-level (hospital nested in AHPs) random-intercept Cox models were compared using the likelihood ratio test, and the 2-level model was always retained. To estimate the causal effect of volume on survival, we performed standardization after fitting the Cox model with all confounders, accounting for clustering, and compared survival curves for different volumes of interest, finding a very small protective effect of high treatment volume for surgery.
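The two candidate volume definitions mentioned above (previous-year volume and average volume over the previous 3 years) can be illustrated with a minimal Python sketch. The function name, the record layout and the toy counts are hypothetical and are not taken from the thesis; treating a year with no recorded treatments as zero volume is likewise an assumption made only for this illustration.

```python
from collections import Counter


def hospital_volumes(treatments, hospital, year):
    """Return (previous-year volume, mean annual volume over the previous 3 years).

    `treatments` is an iterable of (hospital_id, calendar_year) pairs, one per
    treated patient. Assumption: a year with no records counts as zero volume.
    """
    counts = Counter(treatments)
    previous_year = counts[(hospital, year - 1)]
    previous_3_years = [counts[(hospital, year - k)] for k in (1, 2, 3)]
    return previous_year, sum(previous_3_years) / 3


# Hypothetical toy records, not data from the thesis.
records = (
    [("H1", 2011)] * 120 + [("H1", 2012)] * 150 + [("H1", 2013)] * 180
    + [("H2", 2012)] * 30 + [("H2", 2013)] * 60
)

prev_year, prev_3yr_mean = hospital_volumes(records, "H1", 2014)
print(prev_year, prev_3yr_mean)  # 180 150.0
```

The same function applies to specific or overall volume, depending on which treatments are included in the records passed to it.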

The aims of the thesis are to provide a systematic framework for modelling the relationship between hospital volume and long-term survival in oncology, and to analyse this relationship for the hospitals that treated breast cancer in Lombardy, using retrospective data. The first part examines the methodological issues, starting from the definition of a causal approach and making explicit the assumptions that ensure identifiability. The criteria for choosing the volume and for calculating it were then reviewed. The hierarchical nature of this type of analysis, in which subjects nested within hospitals tend to have correlated outcomes, requires appropriate regression methods that must also be robust to cluster informativeness. In this setting the cluster size is not independent of the outcome, given our primary hypothesis that hospital volume has an effect on the outcome and given that volume is highly correlated with cluster size. These aspects were examined by concentrating on marginal models (estimated with GEE or not) and conditional (mixed-effects) models. Finally, the need to analyse the volume-outcome relationship using flexible functions to model the effect of volume was restated, since this effect is not linear in many applications, and suitable spline functions were examined. A simulation study was then designed to mimic the structure of the data in our application setting and to assess the performance of marginal and conditional Cox models. The case study is a cohort of 18,938 patients with breast cancer, diagnosed between January 2014 and December 2016, resident and treated in 5 Lombardy Agencies for Health Protection (ATS). Tumour characteristics at diagnosis come from the ATS cancer registries, and information on treatment from the hospital, outpatient and pharmaceutical databases.
Follow-up was ascertained through the registry of health service beneficiaries up to December 2019. A total of 789 patients were excluded because they were not treated and 651 because they were treated outside the study area. Of the 17,498 included patients, 95% were treated surgically, 38% with chemotherapy and 56% with radiotherapy. Missing values for some potential confounders (stage, tumour grade and educational qualification) were imputed using methods for hierarchical data, through the MICE algorithm and the PMM imputation technique. To determine which volume was most associated with survival, both previous-year volumes and average volumes over the previous 3 years were evaluated, considering either specific treatments (breast cancer only) or overall treatments (all breast surgeries, or the total number of patients treated for chemotherapy and radiotherapy). The association between volume and outcome was then estimated using a marginal Cox model with hospital as the cluster variable, including the following potential confounders: age, stage, morphology, comorbidity index, education, grade, emergency diagnosis. AIC values were used to compare models with linear and non-linear volume effects. For the latter, the graphical trend of volume modelled as a P-spline was explored. Random-intercept Cox models with 2 levels (hospital) and 3 levels (hospital nested in ATS) were compared using the likelihood ratio test (LRT), and the 2-level model was always chosen. To estimate the causal effect of volume on survival, we performed standardization after fitting the Cox model with all confounders and accounting for clustering, and compared the survival curves for different volumes of interest, finding a very small protective effect of high volume in the case of surgery.
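The two model-comparison tools named above, AIC and the likelihood ratio test, reduce to simple arithmetic on fitted log-likelihoods, as the following Python sketch shows. All log-likelihoods and parameter counts are hypothetical placeholders rather than results from the thesis, and the 1-degree-of-freedom chi-square reference is an assumption of this sketch.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


def likelihood_ratio_test_1df(loglik_reduced, loglik_full):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (statistic, p_value) under a chi-square reference with 1 degree
    of freedom, for which P(X > x) = erfc(sqrt(x / 2)).
    """
    statistic = max(2 * (loglik_full - loglik_reduced), 0.0)
    return statistic, math.erfc(math.sqrt(statistic / 2))


# Hypothetical log-likelihoods and parameter counts, not results from the thesis.
aic_linear = aic(log_likelihood=-1000.0, n_params=8)
aic_spline = aic(log_likelihood=-998.5, n_params=11)
stat, p_value = likelihood_ratio_test_1df(loglik_reduced=-1000.0, loglik_full=-998.0)
print(aic_linear, aic_spline, stat, round(p_value, 4))
```

When the extra parameter is a variance component, the null value lies on the boundary of the parameter space and a mixture reference distribution is often preferred; the abstract does not state which reference was used.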

(2024). MODELLING THE RELATIONSHIP OF PROVIDER VOLUME WITH LONG TERM OUTCOME IN ONCOLOGIC PATIENTS WITH AN APPLICATION TO BREAST CANCER IN LOMBARDY. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2024).

MODELLING THE RELATIONSHIP OF PROVIDER VOLUME WITH LONG TERM OUTCOME IN ONCOLOGIC PATIENTS WITH AN APPLICATION TO BREAST CANCER IN LOMBARDY

ANDREANO, ANITA
2024

VALSECCHI, MARIA GRAZIA
Breast cancer; Volume-outcome; Mixed models; Marginal models; Multiple imputation
MED/01 - STATISTICA MEDICA
English
27-feb-2024
35
2022/2023
embargoed_20270227
Files in this item:
File: phd_unimib_027654.pdf
Embargo until 27/02/2027
Description: MODELLING THE RELATIONSHIP OF PROVIDER VOLUME WITH LONG TERM OUTCOME IN ONCOLOGIC PATIENTS WITH AN APPLICATION TO BREAST CANCER IN LOMBARDY
Attachment type: Doctoral thesis
Size: 2.86 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/10281/465021