Package 'PHInfiniteEstimates' reference manual

Title:	Tools for Inference in the Presence of a Monotone Likelihood
Description:	Proportional hazards estimation in the presence of a partially monotone likelihood has difficulties, in that finite estimators do not exist. These difficulties are related to those arising from logistic and multinomial regression. References for methods are given in the separate function documents. Supported by grant NSF DMS 1712839.
Authors:	John E. Kolassa and Juan Zhang
Maintainer:	John E. Kolassa <[email protected]>
License:	GPL-3
Version:	2.9.5
Built:	2025-03-30 05:56:49 UTC
Source:	https://github.com/cran/PHInfiniteEstimates

Calculate the Aalen-Johansen (1978) estimate in the Competing risk context. See Aalen, Odd O., and Søren Johansen. "An Empirical Transition Matrix for Non-Homogeneous Markov Chains Based on Censored Observations." Scandinavian Journal of Statistics 5, no. 3 (1978): 141-50. Accessed January 15, 2021. http://www.jstor.org/stable/4615704.

Description

Calculate the Aalen-Johansen (1978) estimate in the Competing risk context. See Aalen, Odd O., and Søren Johansen. "An Empirical Transition Matrix for Non-Homogeneous Markov Chains Based on Censored Observations." Scandinavian Journal of Statistics 5, no. 3 (1978): 141-50. Accessed January 15, 2021. http://www.jstor.org/stable/4615704.

Usage

aalenjohansen(times, causes)
aalenjohansen(times, causes)

Arguments

`times`	Event times.
`causes`	Causes, with 0 coded as censored, 1 as cause of interest, other for competing.

Value

a list with components

times Unique times
surv Aalen-Johansen estimator for cause 1.

Plot hazards for two strata for each time. At times with an event in one but not the other group, the fitted hazard remains constant, and so the plot is a step function. If hazards are proportional between strata, then the plot should be close to a straight line.

Description

Plot hazards for two strata for each time. At times with an event in one but not the other group, the fitted hazard remains constant, and so the plot is a step function. If hazards are proportional between strata, then the plot should be close to a straight line.

Usage

andersenplot(fit)
andersenplot(fit)

Arguments

fit

A coxph fit with a stratification term.

Examine the potential role of treatment in treatment in a model already including sex. Straight lines that are not 45 degrees indicate the appropriateness of new variable as a linear effect.

Description

Examine the potential role of treatment in treatment in a model already including sex. Straight lines that are not 45 degrees indicate the appropriateness of new variable as a linear effect.

Usage

arjasplot(formulastring, time, stratifier, status, mydata)
arjasplot(formulastring, time, stratifier, status, mydata)

Arguments

`formulastring`	A formula for a coxph fit.
`time`	The name of the time variable
`stratifier`	The name of the stratifier variable
`status`	The name of the status variable
`mydata`	The data frame.

Newton Raphson Fitter for partial likelihood

Description

This function implements the approximate conditional inferential approach of Kolassa and Zhang (2019) to proportional hazards regression.

Usage

bestbeta(fit, exclude = NULL, start = NULL, touse = NA, usecc = FALSE)
bestbeta(fit, exclude = NULL, start = NULL, touse = NA, usecc = FALSE)

Arguments

`fit`	Output from a Cox PH regression, with x=TRUE and y=TRUE
`exclude`	data set with stratum and patient number to exclude.
`start`	Starting value
`touse`	columns of the design matrix to use.
`usecc`	Logical variable indicating whether to use a continuity correction, or nuerical variable representing teh continuity correction.

Value

Fitted survival analysis regression parameter of class coxph

References

Kolassa JE, Zhang J (2019). https://higherlogicdownload.s3.amazonaws.com/AMSTAT/fa4dd52c-8429-41d0-abdf-0011047bfa19/UploadedImages/NCB_Conference/Presentations/2019/kolassa_toxslides.pdf. Accessed: 2019-07-14.

Examples

bfit<-coxph(Surv(TIME,CENS)~T+N+CD,data=breast,x=TRUE)
noccfit<-bestbeta(bfit)
bestbeta(bfit,usecc=TRUE,start=noccfit$start)
bfit<-coxph(Surv(TIME,CENS)~T+N+CD,data=breast,x=TRUE)
noccfit<-bestbeta(bfit)
bestbeta(bfit,usecc=TRUE,start=noccfit$start)

Check how censoring impacts sampling properties of KM fit and log rank test.

Description

Check how censoring impacts sampling properties of KM fit and log rank test.

Usage

checkcensor(nsamp = 1000, nobs = 1000)
checkcensor(nsamp = 1000, nobs = 1000)

Arguments

`nsamp`	Number of MC samples
`nobs`	Number of observations

Value

biases of fits.

Produce a graphical assessment of Monte Carlo experiment on fidelity of proportional hazards regression to the uniform ideal.

Description

This function draws a quantile plot for Monte Carlo assessments of fit to the corrected proportional hazards fit.

Usage

checkresults(regnsimulation, frac = 0.1)
checkresults(regnsimulation, frac = 0.1)

Arguments

`regnsimulation`	A structure with a component out, matrix with columns representing definitions of p-values and as many rows as there MC samples.
`frac`	Proportion for bottom of distribution to be assessed.

Value

A list with components of consisting of simulated Wald p-values, likelihood ratio p-values, and corrected likelihood ratio p-values.

Plot resuts of simcode

Description

Plot resuts of simcode

Usage

compareplot(simresults)
compareplot(simresults)

Arguments

simresults

the result of simcode

Value

nothing.

Simulate from a competing risk model with correlated log normal errors, and plot various estimates.

Description

Simulate from a competing risk model with correlated log normal errors, and plot various estimates.

Usage

compete.simulation(ncr = 4, sig = 0.8, ns = 1000)
compete.simulation(ncr = 4, sig = 0.8, ns = 1000)

Arguments

`ncr`	Number of competing risks.
`sig`	correlation among competing risks.
`ns`	number of observations.

Convert a baseline logit model data set, formatted in the long form as described in the documentation for mlogit.data from mlogit package, to a conditional logistic regression.

Description

Convert a baseline logit model data set, formatted in the long form as described in the documentation for mlogit.data from mlogit package, to a conditional logistic regression.

Usage

convertbaselineltolr(dataset, choice, covs, strs = "chid", alt = "alt")
convertbaselineltolr(dataset, choice, covs, strs = "chid", alt = "alt")

Arguments

`dataset`	in formatted as in the output from mlogit.data of the mlogit packages
`choice`	name of variable in dataset representing choice, a logical variable indicating whether this choice is actually chosen.
`covs`	vector of names of covariates
`strs`	name of variable in data set indicating independent subject
`alt`	name of variable in data set indicating potential choice.

Details

This function implements version of (Kolassa 2016). The multinomial regression is converted to a conditional logistic regression, and methods of (Kolassa 1997) may be applied. This function differs from convertmtol of this package in that convertmtol treats a less-rich data structure, and this function treats the richer data structure that is an output of mlogit.data from package mlogit. Data in the example is from Sanders et al. (2007).

Value

a data set on which to apply conditional logistic regression, corresponding to the baseline logit model.

References

Sanders DJ, Whiteley PF, Clarke HD, Stewart M, Winters K (2007). “The British Election Study.” https://www.britishelectionstudy.com.

Kolassa JE (1997). “Infinite Parameter Estimates in Logistic Regression.” Scandinavian Journal of Statistics, 24, 523–530. doi:10.1111/1467-9469.00078.

Kolassa JE (2016). “Inference in the Presence of Likelihood Monotonicity for Polytomous and Logistic Regression.” Advances in Pure Mathematics, 6, 331-341. doi:10.4236/apm.2016.65024.

Examples

data(voter.ml)
covs<-c("Labor","Liberal.Democrat","education")
#Fit the multinomial regression model, for comparison purposes.
## Lines beginning ## give mlogit syntax that has been made obsolete.
#Add the index attribute to the data set, giving the index of choice made and the index of the 
#alternative, and a boolean variable giving choice.
##attributes(voter.ml)$index<-voter.ml[,c("chid","alt")]
##attributes(voter.ml)$choice<-"voter"
##mlogit(voter~1|Labor+Liberal.Democrat+education,data=voter.ml)
# The package mlogit is scheduled for archiving.  If it is available, the
# next two lines fit the model using mlogit.
# mlogit(voter~1|Labor+Liberal.Democrat+education,data=voter.ml, 
#    chid.var = "chid", alt.var = "alt")
#Convert to a data set allowing treatment as the equivalent conditional logistic regression.  
#This result will be processed using reduceLR of this package to give an equivalent conditional
# regression model avoiding infinite estimates.
out<-convertbaselineltolr(voter.ml,"voter",c("Labor","Liberal.Democrat","education"))
#Fit the associated unconditional logistic regression for comparison purposes.
glm(out[,"y"]~out[,1:75],family=binomial)
data(voter.ml)
covs<-c("Labor","Liberal.Democrat","education")
#Fit the multinomial regression model, for comparison purposes.
## Lines beginning ## give mlogit syntax that has been made obsolete.
#Add the index attribute to the data set, giving the index of choice made and the index of the 
#alternative, and a boolean variable giving choice.
##attributes(voter.ml)$index<-voter.ml[,c("chid","alt")]
##attributes(voter.ml)$choice<-"voter"
##mlogit(voter~1|Labor+Liberal.Democrat+education,data=voter.ml)
# The package mlogit is scheduled for archiving.  If it is available, the
# next two lines fit the model using mlogit.
# mlogit(voter~1|Labor+Liberal.Democrat+education,data=voter.ml, 
#    chid.var = "chid", alt.var = "alt")
#Convert to a data set allowing treatment as the equivalent conditional logistic regression.  
#This result will be processed using reduceLR of this package to give an equivalent conditional
# regression model avoiding infinite estimates.
out<-convertbaselineltolr(voter.ml,"voter",c("Labor","Liberal.Democrat","education"))
#Fit the associated unconditional logistic regression for comparison purposes.
glm(out[,"y"]~out[,1:75],family=binomial)

Convert a polytomous regression to a conditional logistic regression.

Description

Convert a polytomous regression to a conditional logistic regression.

Usage

convertmtol(xmat, str, yvec, subjects)
convertmtol(xmat, str, yvec, subjects)

Arguments

`xmat`	regression matrix
`str`	stratum label
`yvec`	vector of responses
`subjects`	vector of subject labels passed directly to the output.

Details

Implements version of (Kolassa 2016). The multinomial regression is converted to a conditional logistic regression, and methods of (Kolassa 1997) may be applied. This function differs from convertbaselineltolr of this package in that the former treats the richer data structure of package mlogit, and this function treats a less complicated structure. Data in the example is the breast cancer data set breast of package coxphf.

Value

a data set on which to apply conditional logistic regression, corresponding to the multinomial regression model.

References

Kolassa JE (1997). “Infinite Parameter Estimates in Logistic Regression.” Scandinavian Journal of Statistics, 24, 523–530. doi:10.1111/1467-9469.00078.

Kolassa JE (2016). “Inference in the Presence of Likelihood Monotonicity for Polytomous and Logistic Regression.” Advances in Pure Mathematics, 6, 331-341. doi:10.4236/apm.2016.65024.

Examples

#Uses data set breast from package coxphf.
data(breast)
out<-convertstoml(Surv(breast$TIME,breast$CENS),breast[,c("T","N","G","CD")])
out1<-convertmtol(out[,c("T","N","G","CD")],out[,"chid"],out[,"choice"],
   out[,"patients"])
glmout<-glm.fit(out1$xmat,out1$y,family=binomial())
#In many practice examples, the following line shows which observations to retain
#in the logistic regression example.
moderate<-(fitted(glmout)<1-1.0e-8)&(fitted(glmout)>1.0e-8)
# Proportional hazards fit illustrating infinite estimates.
coxph(Surv(TIME,CENS)~ T+ N+ G+ CD,data=breast)
# Wrong analysis naively removing covariate with infinite estimate
coxph(Surv(TIME,CENS)~ T+ N+ CD,data=breast)
summary(glm((CENS>22)~T+N+G+CD,family=binomial,data=breast))

out2<-reduceLR(out1$xmat,yvec=out1$y,keep="CD")
bestcoxout<-coxph(Surv(TIME,CENS)~ T+ N+ G+ CD,data=breast,
   subset=as.numeric(unique(out1$subjects[out2$moderate])))

#Uses data set breast from package coxphf.
data(breast)
out<-convertstoml(Surv(breast$TIME,breast$CENS),breast[,c("T","N","G","CD")])
out1<-convertmtol(out[,c("T","N","G","CD")],out[,"chid"],out[,"choice"],
   out[,"patients"])
glmout<-glm.fit(out1$xmat,out1$y,family=binomial())
#In many practice examples, the following line shows which observations to retain
#in the logistic regression example.
moderate<-(fitted(glmout)<1-1.0e-8)&(fitted(glmout)>1.0e-8)
# Proportional hazards fit illustrating infinite estimates.
coxph(Surv(TIME,CENS)~ T+ N+ G+ CD,data=breast)
# Wrong analysis naively removing covariate with infinite estimate
coxph(Surv(TIME,CENS)~ T+ N+ CD,data=breast)
summary(glm((CENS>22)~T+N+G+CD,family=binomial,data=breast))

out2<-reduceLR(out1$xmat,yvec=out1$y,keep="CD")
bestcoxout<-coxph(Surv(TIME,CENS)~ T+ N+ G+ CD,data=breast,
   subset=as.numeric(unique(out1$subjects[out2$moderate])))

Convert a proportional hazards regression to a multinomial regression.

Description

Convert a proportional hazards regression to a multinomial regression.

Usage

convertstoml(survobj, covmat)
convertstoml(survobj, covmat)

Arguments

`survobj`	A survival object, with potentially right censoring.
`covmat`	a matrix of covariates.

Details

Implements version of (Kolassa and Zhang 2019). The proportional hazards regression is converted to a multinomial regression logistic regression, and methods of (Kolassa 2016) may be applied. This function is intended to produce intermediate results to be passed to convertmtol, and then to reduceLR of (Kolassa 1997). See examples in the convertmtol documentation.

Value

a data set on which to apply conditional multinomial regression, corresponding to the proportional hazards regression analysis. In order to run the line commented out below, you would need this: # @importFrom mlogit mlogit.data

References

Kolassa JE (1997). “Infinite Parameter Estimates in Logistic Regression.” Scandinavian Journal of Statistics, 24, 523–530. doi:10.1111/1467-9469.00078.

Kolassa JE (2016). “Inference in the Presence of Likelihood Monotonicity for Polytomous and Logistic Regression.” Advances in Pure Mathematics, 6, 331-341. doi:10.4236/apm.2016.65024.

Draw diagram for toy PH example.

Description

Draw diagram for toy PH example.

Usage

drawdiagram()
drawdiagram()

Value

nothing.

Remove observations from a proportional hazards regression, and return the fit of the reduced model.

Description

This function implements the approximate conditional inferential approach of Kolassa and Zhang (2019) to proportional hazards regression.

Usage

fixcoxph(randdat, xxx, iv, verbose = FALSE)
fixcoxph(randdat, xxx, iv, verbose = FALSE)

Arguments

`randdat`	A list with at least the component y, representing the Surv() object. I expect that this will be output from an initial non-convergent regression.
`xxx`	a design matrix for the regression. I expect that this will be the $x component of the output from an initial non-convergent regression, run with x=TRUE .
`iv`	name of the variable of interest, as a character string
`verbose`	logical flag governing printing.

Value

Fitted survival analysis regression parameter of class coxph, fitted form data set with observations forcing infinite estimation removed.

References

Examples

data(breast) # From library coxphf
bcfit<-coxph(Surv(TIME,CENS)~ T+ N+ G+ CD,data=breast,x=TRUE)

fixcoxph(bcfit,bcfit$x,"T",Surv(TIME,CENS)~ T+ N+ G+ CD)

testdat2 <- data.frame(Time=c(4,3,1,1,2,2,3),
  Cen=c(1,1,1,0,0,0,0), Primary=c(0,2,1,1,1,0,0), Sex=c(0,0,0,0,1,1,1))
(bcfit<-coxph(Surv(Time,Cen)~Primary + Sex, testdat2, x=TRUE, ties="breslow"))
fixcoxph(bcfit,bcfit$x,"Primary")
data(breast) # From library coxphf
bcfit<-coxph(Surv(TIME,CENS)~ T+ N+ G+ CD,data=breast,x=TRUE)

fixcoxph(bcfit,bcfit$x,"T",Surv(TIME,CENS)~ T+ N+ G+ CD)

testdat2 <- data.frame(Time=c(4,3,1,1,2,2,3),
  Cen=c(1,1,1,0,0,0,0), Primary=c(0,2,1,1,1,0,0), Sex=c(0,0,0,0,1,1,1))
(bcfit<-coxph(Surv(Time,Cen)~Primary + Sex, testdat2, x=TRUE, ties="breslow"))
fixcoxph(bcfit,bcfit$x,"Primary")

Perform Gehan's application to the Wilcoxon test for multiple samples, testing for equivalance of survival curve. See Klein and Moeschberger (1997) Survival Analysis (7.3.3) and pp. 193-194.

Description

Perform Gehan's application to the Wilcoxon test for multiple samples, testing for equivalance of survival curve. See Klein and Moeschberger (1997) Survival Analysis (7.3.3) and pp. 193-194.

Usage

gehan.wilcoxon.test(
  myformula,
  data,
  gehan = TRUE,
  plot = FALSE,
  alpha = 0.05,
  subset = NULL
)
gehan.wilcoxon.test(
  myformula,
  data,
  gehan = TRUE,
  plot = FALSE,
  alpha = 0.05,
  subset = NULL
)

Arguments

`myformula`	Proportional hazards formula appropriate for survfit
`data`	the data set
`gehan`	logical flag triggering the Wilcoxon test (gehan=TRUE), with weights equal to total at risk, or the log rank test (gehan=FALSE) with weights all 1.
`plot`	logical flag triggering plotting.
`alpha`	Nominal test level for plotting on graph
`subset`	Apply to a subset of the data.

Value

An htest-like object with the chi-square version of the test.

Examples

data(breast)#From package coxphf
gehan.wilcoxon.test(Surv(TIME,CENS)~G,data=breast)
data(breast)#From package coxphf
gehan.wilcoxon.test(Surv(TIME,CENS)~G,data=breast)

Simulate operating characteristics of repaired Cox regression and competitors.

Description

This function is intended to verify the operating characteristics of the approximate conditional inferential approach of Kolassa and Zhang (2019) to proportional hazards regression. An exponential regression model, corresponding to the proportional hazards regression model, is fit to the data, and new data sets are simulated from this model. P-values are calculated for these new data sets, and their empirical distribution is compared to the theoretical uniform distribution.

Usage

heinzeschemper(
  nobs = 50,
  k = 5,
  B = 1,
  c = 0,
  nsamp = 1000,
  beta = NULL,
  add = NULL,
  half = NULL,
  verbose = FALSE,
  smoothfirst = FALSE
)
heinzeschemper(
  nobs = 50,
  k = 5,
  B = 1,
  c = 0,
  nsamp = 1000,
  beta = NULL,
  add = NULL,
  half = NULL,
  verbose = FALSE,
  smoothfirst = FALSE
)

Arguments

`nobs`	number of observations in simulated data set.
`k`	number of covariates in simulated data set. Each covariate is dochotomous.
`B`	odds of 1 vs. 0 in dichotomous variables.
`c`	censoring proportion.
`nsamp`	number of samples.
`beta`	regression parameters, all zeros if null, and all the same value if a scalar.
`add`	partial simulation results to be added to, or NULL if de novo.
`half`	does nothing; provided for compatabilitity with simcode.
`verbose`	Triggers verbose messages.
`smoothfirst`	Triggers normal rather than dichotomous interest covariate.

Value

a list with components

out matrix with columns corresponding to p-values.

Perform inference on conditional sample space.

Description

This function performs classical frequentist statistical inference to a discrete multivariate canonical exponential family. It produces the maximum likelihood estimator, one- and two-sided p-values for the test that model parameters are zero, and providing confidence intervals for the parameters. The discrete probability model is given by a set of possible values of the random vectors, and null weights for these vectors. Such a discrete probability model arises in logistic regression, and this function is envisioned to be applied to the results of a network algorithm for conditional logistic regression. Examples apply this to data from Hirji et al. (1987), citing Goorin et al. (1987).

Usage

inference(
  netout,
  alpha = 0.05,
  rng = c(-5, 5),
  alternative = c("two.sided", "less", "greater")
)
inference(
  netout,
  alpha = 0.05,
  rng = c(-5, 5),
  alternative = c("two.sided", "less", "greater")
)

Arguments

`netout`	List of the sort provided by network.
`alpha`	Test level, or 1- confidence level.
`rng`	Range of possible parameter values.
`alternative`	String indicating two- or one-sided alternative, and, if one-sided, direction.

Value

List of outputs, including

ospv Observed one-sided p values
tspv Observed two-sided p value.
ci confidence interval.
estimate Maximum conditional likelihood estimator.
null.value Value of parameter under null hypothesis.
data.name Name of data set
method Method used to generate test.
statistic sufficient statistic value for inference variable.
p.value p.value
conf.int confidence interval.
alternative String indicating two- or one-sided alternative, and, if one-sided, direction.

and including standard stats:::orint.htest components, and of class htest.

References

Hirji KF, Mehta CR, Patel NR (1987). “Computing Distributions for Exact Logistic Regression.” Journal of the American Statistical Association, 82(400), pp. 1110-1117. ISSN 01621459, doi:10.2307/2289388.

Goorin AM, Perez–Atayde A, Gebhardt M, Andersen J (1987). “Weekly High–Dose Methotrexate and Doxorubicin for Osteosarcoma: The Dana–Farber Cancer Institute/The Children's Hospital – Study III.” Journal of Clinical Oncology. doi:10.1200/JCO.1987.5.8.1178.

Examples

#Columns in table are:
# Lymphocytic Infiltration (1=low, 0=high)
# Sex (1=male, 0=female)
# Any Ostioid Pathology (1=yes, 0=no)
# Number in LI-Sex-AOP group
# Number in LI-Sex-AOP group with disease free interval greater than 3 y
goorin<-data.frame(LI=c(0,0,0,0,1,1,1,1),Sex=c(0,0,1,1,0,0,1,1),
   AOP=c(0,1,0,1,0,1,0,1),N=c(3,2,4,1,5,5,9,17),Y=c(3,2,4,1,5,3,5,6))

netout<-network(goorin[,1:3],goorin[,4],conditionon=1:3,resp=goorin[,5])
inference(netout)

#Columns in table are:
# Lymphocytic Infiltration (1=low, 0=high)
# Sex (1=male, 0=female)
# Any Ostioid Pathology (1=yes, 0=no)
# Number in LI-Sex-AOP group
# Number in LI-Sex-AOP group with disease free interval greater than 3 y
goorin<-data.frame(LI=c(0,0,0,0,1,1,1,1),Sex=c(0,0,1,1,0,0,1,1),
   AOP=c(0,1,0,1,0,1,0,1),N=c(3,2,4,1,5,5,9,17),Y=c(3,2,4,1,5,3,5,6))

netout<-network(goorin[,1:3],goorin[,4],conditionon=1:3,resp=goorin[,5])
inference(netout)

Assess the accuracy of the log rank statistic approximation to the true value, in the case without censoring. Provides plots of statistics, and empirical test level.

Description

Assess the accuracy of the log rank statistic approximation to the true value, in the case without censoring. Provides plots of statistics, and empirical test level.

Usage

lrapproximations(nobs = 10, ratio = 1, nsamp = 1000)
lrapproximations(nobs = 10, ratio = 1, nsamp = 1000)

Arguments

`nobs`	number of observations in each group. This currently supports only equal group size data sets.
`ratio`	Ratio of group means; use 1 for null.
`nsamp`	Monte Carlo sample size.

Value

a vector of empirical test sizes.

Examples

lrapproximations(nsamp=100)
lrapproximations(nsamp=100)

This function enumerates conditional sample spaces associated with logistic regression,

Description

This function uses a network algorithm to enumerate conditional sample spaces associated with logistic regression, using a minimal version of the algorithm of Hirji et al. (1987).

Usage

network(
  dm,
  n = NULL,
  resp = NULL,
  conditionon = NULL,
  sst = NULL,
  addint = TRUE,
  verbose = FALSE,
  data.name = "Test data"
)
network(
  dm,
  n = NULL,
  resp = NULL,
  conditionon = NULL,
  sst = NULL,
  addint = TRUE,
  verbose = FALSE,
  data.name = "Test data"
)

Arguments

`dm`	matrix of covariates
`n`	Vector of number of trials. If null, make them all ones.
`resp`	vector of successes. Used only to calculate the sufficient statistics, unless sufficient statistics are entered directly. Either resp or sst must be provided.
`conditionon`	indices of covariate matrix indicating sufficient statistics to be conditioned on.
`sst`	sufficient statistic vector, if input directly. Otherwise, recomputed from resp.
`addint`	logical, true if a column of 1s must be added to the covariate matrix.
`verbose`	logical; if true, print intermediate results.
`data.name`	Name of the data set.

Details

Examples apply this to data from Hirji et al. (1987), citing Goorin et al. (1987).

Value

For a successful run, a list with components:

possible matrix with vectors of possible unconditioned values of the sufficient statistic.
count count of entries in the conditional distribution.
obsd Observed value of unconditioned sufficient statistics.

For an unsuccessful run (because of input inconsistencies) NA

References

Examples

#Columns in table are:
# Lymphocytic Infiltration (1=low, 0=high)
# Sex (1=male, 0=female)
# Any Ostioid Pathology (1=yes, 0=no)
# Number in LI-Sex-AOP group
# Number in LI-Sex-AOP group with disease free interval greater than 3 y
goorin<-data.frame(LI=c(0,0,0,0,1,1,1,1),Sex=c(0,0,1,1,0,0,1,1),
   AOP=c(0,1,0,1,0,1,0,1),N=c(3,2,4,1,5,5,9,17),Y=c(3,2,4,1,5,3,5,6))

out<-network(goorin[,1:3],goorin[,4],conditionon=1:3,resp=goorin[,5])
inference(out)

#Columns in table are:
# Lymphocytic Infiltration (1=low, 0=high)
# Sex (1=male, 0=female)
# Any Ostioid Pathology (1=yes, 0=no)
# Number in LI-Sex-AOP group
# Number in LI-Sex-AOP group with disease free interval greater than 3 y
goorin<-data.frame(LI=c(0,0,0,0,1,1,1,1),Sex=c(0,0,1,1,0,0,1,1),
   AOP=c(0,1,0,1,0,1,0,1),N=c(3,2,4,1,5,5,9,17),Y=c(3,2,4,1,5,3,5,6))

out<-network(goorin[,1:3],goorin[,4],conditionon=1:3,resp=goorin[,5])
inference(out)

Proportional hazards partial likelihood, using Breslow method for ties, excluding some observations.

Description

This function implements the approximate conditional inferential approach of Kolassa and Zhang (2019) to proportional hazards regression.

Usage

newllk(
  beta,
  fit,
  exclude = NULL,
  minus = FALSE,
  keeponly = NULL,
  justd0 = FALSE,
  cc1 = 0
)
newllk(
  beta,
  fit,
  exclude = NULL,
  minus = FALSE,
  keeponly = NULL,
  justd0 = FALSE,
  cc1 = 0
)

Arguments

`beta`	parameter vector.
`fit`	Output from a Cox PH regression, with x=TRUE and y=TRUE
`exclude`	data set with stratum and patient number to exclude.
`minus`	logical flag to change sign of partial likelyhood
`keeponly`	variables to retain. Keep all if this is null or NA.
`justd0`	logical variable, indicating whether to calculate only the function value and skip derivatives.
`cc1`	Continuity correction for first component of the score.

Value

a list with components

d0 partial likelihood
d1 first derivative vector
d2 second derivative matrix

References

PHInfiniteEstimates: Tools for Proportional Hazards Estimation, and Inference on the Associate Parameters, when Other Parameters are Estimated at Infinity.

Description

The PHInfiniteEstimates package Proportional hazards estimation in the presence of partial likelihood monitonicity has difficulties, in that finite estimators do not exist. These difficulties are related to those arising from logistic regression, addressed by (Kolassa 1997), and multinomial regression, addressed by (Kolassa 2016). Algorithms to provide conditionally similar problems in these contexts are provided.

References

Kolassa JE (1997). “Infinite Parameter Estimates in Logistic Regression.” Scandinavian Journal of Statistics, 24, 523–530. doi:10.1111/1467-9469.00078.

Kolassa JE (2016). “Inference in the Presence of Likelihood Monotonicity for Polytomous and Logistic Regression.” Advances in Pure Mathematics, 6, 331-341. doi:10.4236/apm.2016.65024.

Partial likelihood for proportional hazards

Description

Partial likelihood for proportional hazards

Usage

pllk(beta, xmat, ind, cc = NULL)
pllk(beta, xmat, ind, cc = NULL)

Arguments

`beta`	parameter vector
`xmat`	regression matrix
`ind`	censoring indicator, 1 for event and any other value otherwise.
`cc`	Continuity correction for sum of x vectors with multiple occurrences in risk set. For binary covariates is half. Default a vector of zeros.

Value

a list with components

d0 partial likelihood
d1 first derivative vector
d2 second derivative matrix

Examples

#Uses data set breast from package coxphf.
data(breast)
 xmat<-as.matrix(breast)[order(breast$TIME),c("T","N")]
 ind<-breast$CENS[order(breast$TIME)]
 short<-coxph(Surv(TIME,CENS)~ T+ N,data=breast)
 pllk(as.vector(coef(short)),xmat,ind)
#Uses data set breast from package coxphf.
data(breast)
 xmat<-as.matrix(breast)[order(breast$TIME),c("T","N")]
 ind<-breast$CENS[order(breast$TIME)]
 short<-coxph(Surv(TIME,CENS)~ T+ N,data=breast)
 pllk(as.vector(coef(short)),xmat,ind)

Reduce a logistic regression with monotone likelihood to a conditional regression with double descending likelihood.

Description

Reduce a logistic regression with monotone likelihood to a conditional regression with double descending likelihood.

Usage

reduceLR(Z, nvec = NULL, yvec = NULL, keep, sst = NULL, verybig = 1e+07)
reduceLR(Z, nvec = NULL, yvec = NULL, keep, sst = NULL, verybig = 1e+07)

Arguments

`Z`	regression matrix
`nvec`	vector of sample sizes
`yvec`	vector of responses
`keep`	vector of variable names to block from consideration for removal.
`sst`	vector of sufficient statistics
`verybig`	threshold for condition number to declare colinearity.

Details

This function implements version of Kolassa (1997). It is intended for use with extensions to multinomial regression as in Kolassa (1997) and to survival analysis as in Kolassa and Zhang (2019). The method involves linear optimization that is potentially repeated. Initial calculations were done using a proprietary coding of the simplex, in a way that allowed for later iterations to be restarted from earlier iterations; this computational advantage is not employed here, in favor of computational tools in the public domain and included in the R package lpSolve. Furthermore, Kolassa (1997) removed regressors that became linearly dependent using orthogonalization, but on further reflection this computation is unnecessary. Data in the examples are from Hirji et al. (1987), citing Goorin et al. (1987).

Value

a list with components

keepme indicators of which variables are retained in the reduced data set
moderate indicatiors of which observations are retained in the reduced data set
extreme indicators of which observations are removed in the reduced data set
toosmall indicator of whether resulting data set is too small to fit the proportional hazards regression

References

Kolassa JE (1997). “Infinite Parameter Estimates in Logistic Regression.” Scandinavian Journal of Statistics, 24, 523–530. doi:10.1111/1467-9469.00078.

Kolassa JE (2016). “Inference in the Presence of Likelihood Monotonicity for Polytomous and Logistic Regression.” Advances in Pure Mathematics, 6, 331-341. doi:10.4236/apm.2016.65024.

Examples

#Cancer Data
Z<-cbind(rep(1,8),c(rep(0,4),rep(1,4)),rep(c(0,0,1,1),2),rep(c(0,1),4))
dimnames(Z)<-list(NULL,c("1","LI","SEX","AOP"))
nvec<-c(3,2,4,1,5,5,9,17); yvec<-c(3,2,4,1,5,3,5,6)
reduceLR(Z,nvec,yvec,c("SEX","AOP"))
#CD4, CD8 data
Z<-cbind(1,c(0,0,1,1,0,0,1,0),c(0,0,0,0,1,1,0,1),c(0,0,0,0,0,1,1,0),c(0,1,0,1,0,0,0,1))
dimnames(Z)<-list(NULL,c("1","CD41","CD42","CD81","CD82"))
nvec<-c(7,1,7,2,2,13,12,3); yvec<-c(4,1,2,2,0,0,4,1)
reduceLR(Z,nvec,yvec,"CD41")
#Cancer Data
Z<-cbind(rep(1,8),c(rep(0,4),rep(1,4)),rep(c(0,0,1,1),2),rep(c(0,1),4))
dimnames(Z)<-list(NULL,c("1","LI","SEX","AOP"))
nvec<-c(3,2,4,1,5,5,9,17); yvec<-c(3,2,4,1,5,3,5,6)
reduceLR(Z,nvec,yvec,c("SEX","AOP"))
#CD4, CD8 data
Z<-cbind(1,c(0,0,1,1,0,0,1,0),c(0,0,0,0,1,1,0,1),c(0,0,0,0,0,1,1,0),c(0,1,0,1,0,0,0,1))
dimnames(Z)<-list(NULL,c("1","CD41","CD42","CD81","CD82"))
nvec<-c(7,1,7,2,2,13,12,3); yvec<-c(4,1,2,2,0,0,4,1)
reduceLR(Z,nvec,yvec,"CD41")

Simulate Weibull survival data from a model, perform reduction to remove infinite estimates, and calculate p values.

Description

Operating characteristics for the approximate conditional inferential approach to proportional hazards.

Usage

simcode(
  dataset,
  myformula,
  iv,
  ctime,
  nsamp = 10000,
  add = NULL,
  nobs = NA,
  half = FALSE,
  verbose = FALSE
)
simcode(
  dataset,
  myformula,
  iv,
  ctime,
  nsamp = 10000,
  add = NULL,
  nobs = NA,
  half = FALSE,
  verbose = FALSE
)

Arguments

`dataset`	the data set to use
`myformula`	the formula for the Cox regression
`iv`	name of the variable of interest, as a character string
`ctime`	fixed censoring time
`nsamp`	number of samples.
`add`	preliminary results, if any.
`nobs`	number of observations in target models, if different from that of dataset.
`half`	logical flag triggering a less extreme simulation by dividing the Weibull regression parameters in half.
`verbose`	logical flag triggering intermediate messaging.

Details

This function is intended to verify the operating characteristics of the approximate conditional inferential approach of Kolassa and Zhang (2019) to proportional hazards regression. A Weibull regression model, corresponding to the proportional hazards regression model, is fit to the data, and new data sets are simulated from this model. P-values are calculated for these new data sets, and their empirical distribution is compared to the theoretical uniform distribution.

Value

a list with components

out matrix with columns corresponding to p-values.
seed random seed
bad unused.
srreg parametric lifetime regression

References

Examples

data(breast)

breasttestp<-simcode(breast,Surv(TIME,CENS)~ T+ N+ G+ CD,"T",72,nsamp=100,verbose=TRUE)

data(breast)

breasttestp<-simcode(breast,Surv(TIME,CENS)~ T+ N+ G+ CD,"T",72,nsamp=100,verbose=TRUE)

Calculate simultaneous coverage of pointwise confidence intervals.

Description

Simulate exponential event times with expecation 1. Simulate censoring times with expectation 2. Calculate confidence intervals and check simultaneous coverage.

Usage

simultaneouscoverage(nsamp, nobs)
simultaneouscoverage(nsamp, nobs)

Arguments

`nsamp`	Number of Monte Carlo samples.
`nobs`	Number of observations per sample.

Value

Simultaneous coverage proportion.

Examples

simultaneouscoverage(1000,20)
simultaneouscoverage(1000,20)

Summarize proportional hazards fits

Description

Summarize proportional hazards fits

Usage

summarizefits(
  repairedfit,
  penalizedout,
  penalizedoutsmaller,
  iv,
  verbose = TRUE
)
summarizefits(
  repairedfit,
  penalizedout,
  penalizedoutsmaller,
  iv,
  verbose = TRUE
)

Arguments

`repairedfit`	coxph fit
`penalizedout`	coxphf fit
`penalizedoutsmaller`	smaller coxphf fit
`iv`	name of the variable of interest, as a character string
`verbose`	logical flag triggering intermediate messaging.

Value

a vector with components

Wald p-value from the Cox regression fit.
partial likelihood ratio p-value from Cox regression fit.
parameter estimate from the Cox regression fit.
standard error from the Cox regression fit.
Conditional Skovgaard standard error from the Cox regression fit.
Signed root of the partial likelihood ratio statistic from Cox regression fit.
partial likelihood ratio statistic p-value from coxphf
Wald p-value from coxphf
parameter estimate from coxphf
standard error from coxphf
number of parameters in reduced fit.

References

Summarize the results of simulations investigating operating conditions for the data reduction method to avoid monotone likelihood. Files are of form "hsxxx", for xxx numerals.

Description

Summarize the results of simulations investigating operating conditions for the data reduction method to avoid monotone likelihood. Files are of form "hsxxx", for xxx numerals.

Usage

summarizetable()
summarizetable()

Fit survival probabilties from a survreg object.

Description

Fit survival probabilties from a survreg object.

Usage

survregpredict(fit, newdata, time)
survregpredict(fit, newdata, time)

Arguments

`fit`	a survreg object. This should not contain strata(). It also must use the log transformation.
`newdata`	a new data set with covariates from the fit.
`time`	a time value (on the original, and not log, scale).

Examples

#Fit the survival probability for an individual with extent 1 and
#differentiation 2 at 700 days from a Weibull regression using the 
#colon cancer data set distributed as part of the survival package.
fit<-survreg(Surv(time,status)~factor(extent)+differ,data=colon)
survregpredict(fit,data.frame(extent=1,differ=2),700)
#Fit the survival probability for an individual with extent 1 and
#differentiation 2 at 700 days from a Weibull regression using the 
#colon cancer data set distributed as part of the survival package.
fit<-survreg(Surv(time,status)~factor(extent)+differ,data=colon)
survregpredict(fit,data.frame(extent=1,differ=2),700)

Test size of asymptotic Cox tests.

Description

Test size of asymptotic Cox tests.

Usage

testcox(nsamp = 1000, nobs = 50, ncov = 5, randomcov = TRUE)
testcox(nsamp = 1000, nobs = 50, ncov = 5, randomcov = TRUE)

Arguments

`nsamp`	Number of MC samples
`nobs`	Number of observations
`ncov`	Number of covariates
`randomcov`	Indicator of whether to draw random covariates.

Value

level of two-sided test of nominal size 0.05.

Subset of British elections data used in (Kolassa 2016). Data are from (Sanders et al. 2007).

Description

Subset of British elections data used in (Kolassa 2016). Data are from (Sanders et al. 2007).

Usage

data(voter.ml)
data(voter.ml)

References

Sanders DJ, Whiteley PF, Clarke HD, Stewart M, Winters K (2007). “The British Election Study.” https://www.britishelectionstudy.com.

Kolassa JE (2016). “Inference in the Presence of Likelihood Monotonicity for Polytomous and Logistic Regression.” Advances in Pure Mathematics, 6, 331-341. doi:10.4236/apm.2016.65024.

Examples

data(voter.ml)
data(voter.ml)

Package 'PHInfiniteEstimates'

Help Index

Description

Usage

Arguments

Value

Plot hazards for two strata for each time. At times with an event in one but not the other group, the fitted hazard remains constant, and so the plot is a step function. If hazards are proportional between strata, then the plot should be close to a straight line.

Description

Usage

Arguments

Examine the potential role of treatment in treatment in a model already including sex. Straight lines that are not 45 degrees indicate the appropriateness of new variable as a linear effect.

Description

Usage

Arguments

Newton Raphson Fitter for partial likelihood

Description

Usage

Arguments

Value

References

Examples

Check how censoring impacts sampling properties of KM fit and log rank test.

Description

Usage

Arguments

Value

Produce a graphical assessment of Monte Carlo experiment on fidelity of proportional hazards regression to the uniform ideal.

Description

Usage

Arguments

Value

Plot resuts of simcode

Description

Usage

Arguments

Value

Simulate from a competing risk model with correlated log normal errors, and plot various estimates.

Description

Usage

Arguments

Convert a baseline logit model data set, formatted in the long form as described in the documentation for mlogit.data from mlogit package, to a conditional logistic regression.

Description

Usage

Arguments

Details

Value

References

Examples

Convert a polytomous regression to a conditional logistic regression.

Description

Usage

Arguments

Details

Value

References

Examples

Convert a proportional hazards regression to a multinomial regression.

Description

Usage

Arguments

Details

Value

References

Draw diagram for toy PH example.

Description

Usage

Value

Remove observations from a proportional hazards regression, and return the fit of the reduced model.

Description

Usage

Arguments

Value

References

Examples

Perform Gehan's application to the Wilcoxon test for multiple samples, testing for equivalance of survival curve. See Klein and Moeschberger (1997) Survival Analysis (7.3.3) and pp. 193-194.

Description

Usage

Arguments

Value

Examples