Partek Flow Documentation

Introducing

...

Cox regression (Cox proportional-hazards model) tests the effects of several factors (predictors) on survival time. Predictors that lower the probability of survival at a given time are called risk factors; predictors that increase the probability of survival at a given time are called protective factors. The Cox proportional-hazards model is similar to a multiple logistic regression, except that it considers time-to-event rather than simply whether or not an event occurred.

In this tutorial, we will use Cox Regression to test the effects of tumor gene expression on survival time while accounting for tumor size.   

Performing Cox Regression Analysis

To begin, you should have the Survival Tutorial data set open in Partek Genomics Suite as shown.

...

Figure 1. Invoking Cox Regression

The Cox Regression dialog will open. Please note that in this tutorial data set, column 1. Survival (years) indicates the survival time of each patient in years and column 2. Event indicates the event status for each patient, death or censored. 

  • Set Time Variable to 1. Survival (years) using the drop-down menu
  • Set Event Variable to 2. Event using the drop-down menu

Only numeric columns are listed in the Time Variable drop-down list, and only categorical columns with exactly two categories are listed in the Event Variable drop-down list.

  • Set Event Status to death using the drop-down menu (Figure 2)

Event Status should be set to the primary event outcome. All Response Variables will be automatically selected for Predictor. This means that Cox Regression will test every probe set for association with survival (time-to-event).

Figure 2. Configuring the Cox Regression dialog

Co-predictors are numeric or categorical factors that will be included in the regression model alongside the predictor. To test the effect of gene expression on survival while accounting for tumor size, we can add tumor size to the co-predictors list.

  • Select 7. tumor size (mm) from the Candidate(s) panel
  • Select Add Factor > to add it to the Co-predictor(s) panel
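As a sketch of what adding a co-predictor means, assuming the standard Cox proportional-hazards form (the symbols below are illustrative and do not appear in the dialog), the model fitted for each probe set can be written as:

```latex
h(t \mid x_{\text{gene}}, x_{\text{size}})
  = h_0(t)\,\exp\!\left(\beta_{\text{gene}}\,x_{\text{gene}} + \beta_{\text{size}}\,x_{\text{size}}\right),
\qquad
\mathrm{HR}_{\text{gene}} = e^{\beta_{\text{gene}}},\quad
\mathrm{HR}_{\text{size}} = e^{\beta_{\text{size}}}
```

Here h_0(t) is the unspecified baseline hazard, x_gene is the expression value of the probe set, and x_size is the tumor size in mm. Each hazard ratio describes the effect of one variable with the other held fixed.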

Advanced options, such as the inclusion of interactions between predictors and co-predictors, can be accessed by selecting Model... (Figure 3). The Results... button invokes a dialog (Figure 4) with additional output options for the results spreadsheet. We do not need to adjust any of the advanced model or output options for this tutorial.

Figure 3. Configuring advanced options for Cox Regression

Figure 4. Configuring output options for Cox Regression

...

Figure 5. Configuring Cox Regression to assess the effect of gene expression and tumor size on survival

The spreadsheet generated by Cox Regression (Figure 6) includes a row for each probe set; the columns provide the following information: 

1. Column # - Column number of probe set in probe intensities spreadsheet

2. Probeset ID - ID of probe set in probe intensities spreadsheet

3. HRatio(gene) - Hazard ratio for the probe set 

4. LowCI(gene) - lower 95% confidence boundary of the hazard ratio for the probe set

5. UpCI(gene) - upper 95% confidence boundary of the hazard ratio for the probe set

6. p-value(gene) - P-value of the corresponding Chi-squared test. A low value indicates that the predictor (probe set) is significantly associated with survival time; the direction of the effect (shortened or prolonged survival) is given by the hazard ratio

7. to 10. - Effects of the co-predictor on survival time; for each co-predictor, a similar set of columns is added

11. modelfit(0) - P-value of the test assessing the overall model fit, i.e., the relationship between survival time, the predictor, and co-predictors in the model. A modelfit value greater than 0.05 indicates a weak association between survival time and the predictor and/or co-predictors.
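To make the relationship between the HRatio, LowCI, UpCI, and p-value columns concrete, here is a minimal Python sketch. It assumes a Wald-type Chi-squared test with one degree of freedom; the documentation does not state which Chi-squared test the software uses, and the function name and input values are hypothetical.

```python
import math

def hazard_ratio_summary(beta, se, z_crit=1.959964):
    """Return (HR, lower 95% bound, upper 95% bound, p-value).

    beta -- estimated Cox regression coefficient (log hazard ratio)
    se   -- standard error of beta
    Assumption: Wald Chi-squared test with 1 degree of freedom.
    """
    hr = math.exp(beta)
    low = math.exp(beta - z_crit * se)
    up = math.exp(beta + z_crit * se)
    chi2 = (beta / se) ** 2
    # Upper tail of a Chi-squared distribution with 1 df:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return hr, low, up, p_value

# Hypothetical coefficient and standard error, for illustration only
hr, low, up, p = hazard_ratio_summary(beta=math.log(2), se=0.25)
print(f"HR={hr:.2f}  95% CI=({low:.2f}, {up:.2f})  p={p:.4f}")
```

A confidence interval that excludes 1 goes together with a p-value below 0.05 under this assumption.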

Please note that the Cox Regression results spreadsheet is a temporary file. If you would like to be able to view the spreadsheet again after closing Partek Genomics Suite, be sure to save it by selecting the Save Active Spreadsheet icon.

Figure 6. Cox Regression results spreadsheet

Survival Analysis

Survival analysis is a branch of statistics that deals with modeling of time-to-event. In the context of “survival,” the most common event studied is death; however, any other important biological event could be analyzed in a similar fashion (e.g., spreading of the primary tumor or occurrence/relapse of disease). The significant event should be well-defined and occur at a specific time.

As the primary outcome event is typically unfavorable (e.g., death, metastasis, relapse, etc.), the event is called a “hazard.” Survival analysis tries to answer questions such as: What is the proportion of a population who will survive past a certain time (i.e., what is the 5-year survival rate)? What is the rate at which the event occurs? Do particular characteristics have an impact on survival rates (e.g., are certain genes associated with survival)? Is the 5-year survival rate improved in patients treated by a new drug? 

The hazard ratio is an effect size measure used to assess the direction and magnitude of the effect of a predictor variable on the relative likelihood of the event occurring at any given point in time, controlling for other predictors in the model. For continuous predictors, such as gene expression values and tumor size, the hazard ratio is the predicted change in the hazard for a unit increase in the predictor. A hazard ratio greater than 1 indicates that the predictor is associated with shorter time-to-event, a hazard ratio less than 1 indicates that the predictor is associated with longer time-to-event, and a hazard ratio of 1 indicates that the predictor has no effect on time-to-event. For categorical predictors, the hazard ratio is relative to the reference category.
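Because the hazard ratio for a continuous predictor refers to a one-unit increase, the ratio for a larger change is obtained by exponentiating the coefficient times that change. The short Python sketch below illustrates this; the coefficient value is made up for the example and is not taken from the tutorial data.

```python
import math

def hazard_ratio_for_change(beta, delta=1.0):
    """Hazard ratio for a change of `delta` units in a continuous predictor.

    beta is the Cox regression coefficient (natural log of the per-unit HR).
    """
    return math.exp(beta * delta)

# Hypothetical coefficient: hazard rises 5% per additional mm of tumor size
beta_size = math.log(1.05)

print(hazard_ratio_for_change(beta_size))      # per 1 mm
print(hazard_ratio_for_change(beta_size, 10))  # per 10 mm, i.e. 1.05 ** 10
```

The effect is multiplicative, so a modest per-unit hazard ratio can correspond to a substantial ratio across the observed range of the predictor.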

For any probe set, we can view a detailed HTML report. 

...

Figure 7. Invoking an HTML report for a probe set

The HTML report (Figure 8) will open in your default web browser. 

...

Figure 8. Cox Regression HTML report

...

An important feature of survival analysis is the presence of “censored” data. Censored data refers to subjects that have not experienced the event being studied. For example, medical studies often focus on survival of patients after treatment so the survival times are recorded during the study period. At the end of the study period, some patients are dead, some patients are alive, and the status of some patients is unknown because they dropped out of the study. Censored data refers to the latter two groups. The patients who survived until the end of the study or those who dropped out of the study have not experienced the study event "death" and are listed as "censored". 
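To illustrate how censored subjects still contribute information, the sketch below computes a Kaplan-Meier survival estimate in plain Python. Kaplan-Meier is a different technique from Cox regression and is used here only because it shows the role of censoring compactly. The encoding (1 = event observed, 0 = censored) and the sample values are assumptions for the example.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time for each subject
    events -- 1 if the event (e.g., death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects that share the same follow-up time
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without lowering survival
        at_risk -= n_leaving
    return curve

# Hypothetical follow-up times in years; 0 marks a censored subject
for t, s in kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1]):
    print(t, round(s, 3))
```

A censored subject counts as "at risk" up to the time it leaves the study, so it influences the denominator of later estimates without ever being counted as an event.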

Introducing Cox Regression

Cox regression (Cox proportional-hazards model) tests the effects of factors (predictors) on survival time. Predictors that lower the probability of survival at a given time are called risk factors; predictors that increase the probability of survival at a given time are called protective factors. The Cox proportional-hazards model is similar to a multiple logistic regression, except that it considers time-to-event rather than simply whether or not an event occurred. Cox regression should not be used for a small sample size (n < 40) because the events could accidentally concentrate in one of the cohorts, leading to an infinite hazard ratio, which does not produce meaningful results (Xu R, Shaw PA, Mehrotra DV. Hazard ratio estimation in small samples. Stat Biopharm Res. https://shawstat.org/wp-content/uploads/2021/03/Hazard-ratio-estimation-in-small-samples.pdf).
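For readers who want to see what the fit involves, the following Python sketch estimates a Cox model with a single predictor by maximizing the partial likelihood with Newton-Raphson steps (Breslow handling of ties). It is a textbook illustration under those assumptions, not Partek Flow's implementation; the function name and the data are hypothetical.

```python
import math

def cox_fit_single(times, events, x, n_iter=25, tol=1e-9):
    """Fit a one-predictor Cox model; return (beta, standard error).

    times  -- follow-up times
    events -- 1 if the event was observed, 0 if censored
    x      -- predictor value for each subject
    """
    beta = 0.0
    n = len(times)
    info = 0.0
    for _ in range(n_iter):
        score = 0.0
        info = 0.0
        for i in range(n):
            if not events[i]:
                continue  # censored subjects contribute only through risk sets
            s0 = s1 = s2 = 0.0
            for j in range(n):
                if times[j] >= times[i]:  # subject j is still at risk
                    w = math.exp(beta * x[j])
                    s0 += w
                    s1 += w * x[j]
                    s2 += w * x[j] * x[j]
            mean = s1 / s0
            score += x[i] - mean             # first derivative of log-likelihood
            info += s2 / s0 - mean * mean    # negative second derivative
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    # Standard error from the information computed in the last iteration
    return beta, 1.0 / math.sqrt(info)

# Hypothetical data: x = 1 marks a high-expression group
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 0, 1, 1, 0, 1]
group = [1, 1, 0, 1, 0, 1, 0, 0]
beta, se = cox_fit_single(times, events, group)
print("hazard ratio:", math.exp(beta))
```

When every event occurs in the subject with the largest (or smallest) predictor value in its risk set, the score never reaches zero and the estimate diverges, which corresponds to the infinite hazard ratio described above for small samples.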

Configuring the Cox Regression Dialog

  • Open the Cox Regression task in the task menu under Statistics


  • Next, select the Time, Event, and Event status using the drop-down menus. Partek Flow will automatically suggest factors that are appropriate for these options.


  • Define the predictors and co-predictors (factors) in the model. Co-predictors are numeric or categorical factors that are included in the regression model. By default, time-to-event analysis is performed on each feature (e.g., gene). To model a different variable, select a factor that is not a feature; this disables the use of each feature as a factor. Then choose the factor of interest from the drop-down menu. Use Add factors to add co-predictors that help explain time-to-event in addition to the features. Use Add interaction to add co-predictors with known dependencies. Factors added here cannot also be added as stratification factors.
  • Next, define comparisons. Configure a contrast by moving factor levels into the numerator (e.g., experimental group) or denominator (e.g., control or reference group), choose Combine or Pairwise, and add the comparison, which is then displayed below. Combine merges all numerator levels and all denominator levels into a single comparison. Pairwise splits them into a factorial set of comparisons, so that every numerator level is paired with every denominator level. Multiple comparisons from different factors can be added.
  • Select categorical factors to perform stratification. Stratification is used when the proportional hazards assumption is violated, i.e., when the effect of a co-predictor on the hazard is not constant over time. Stratified Cox regression accounts for non-proportional hazards by splitting the data into subgroups (strata) based on the categorical factor and then fitting the stratified model. This accounts for the effect of a co-predictor that varies over time.
  • The results of Cox regression analysis provide key information to interpret, including:
    • Hazard ratio (HR): if HR = 0.5, the event rate at any given time is half that of the control group; if HR = 1, the event rates are the same in both groups; and if HR = 2, the event rate is twice that of the control group.
    • HR limit
    • P-value: the lower the p-value, the greater the significance of the observation. 
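The difference between the Combine and Pairwise options described in the comparison step above can be sketched in a few lines of Python. This only illustrates how the sets of comparisons differ; the function, the tuple representation, and the level names are hypothetical and do not reflect how Partek Flow stores contrasts.

```python
from itertools import product

def build_comparisons(numerator, denominator, mode="pairwise"):
    """Return a list of (numerator levels, denominator levels) comparisons."""
    if mode == "combine":
        # One comparison: all numerator levels vs. all denominator levels
        return [(tuple(numerator), tuple(denominator))]
    if mode == "pairwise":
        # Factorial set: every numerator level vs. every denominator level
        return [((n,), (d,)) for n, d in product(numerator, denominator)]
    raise ValueError("mode must be 'combine' or 'pairwise'")

# Hypothetical factor levels
numerator = ["Drug A", "Drug B"]
denominator = ["Placebo", "Untreated"]

print(build_comparisons(numerator, denominator, "combine"))
for comparison in build_comparisons(numerator, denominator, "pairwise"):
    print(comparison)
```

With two numerator levels and two denominator levels, Combine yields one comparison and Pairwise yields four.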



Additional assistance



...