This R package provides functions for modifying gtsummary baseline tables: adding SMDs (for any number of groups and any variable type) and rounding counts to avoid disclosing microdata.

zheer-kejlberg/Z.gtsummary.addons

Functions to modify gtsummary baseline tables

Zheer Kejlberg Al-Mashhadi 2023-11-19

Adding SMDs and rounding counts and percentages

This document presents the functions in the package with examples. The add_SMD() function is demonstrated first, followed by the round_5_gtsummary() function.



Installing the package

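install.packages("devtools")
devtools::install_github("zheer-kejlberg/Z.gtsummary.addons")
library(gtsummary) # provides tbl_summary(), tbl_svysummary() and the trial dataset used below
library(dplyr) # provides mutate() and the %>% pipe used below
library(Z.gtsummary.addons)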


add_SMD()


For unweighted data (with a tbl_summary() object), see the following examples:

For one SMD per variable, the location argument does not need to be specified (as it defaults to “label”).

trial %>% 
  tbl_summary(by = grade, include = c(age, stage)) %>%
  add_SMD()

| Characteristic | I, N = 68¹ | II, N = 68¹ | III, N = 64¹ | SMD: I vs. II | SMD: I vs. III | SMD: II vs. III |
|---|---|---|---|---|---|---|
| Age | 47 (37, 56) | 49 (37, 57) | 47 (38, 58) | -0.10 | -0.13 | -0.04 |
|     Unknown | 2 | 6 | 3 | | | |
| T Stage | | | | 0.29 | 0.19 | 0.31 |
|     T1 | 17 (25%) | 23 (34%) | 13 (20%) | | | |
|     T2 | 18 (26%) | 17 (25%) | 19 (30%) | | | |
|     T3 | 18 (26%) | 11 (16%) | 14 (22%) | | | |
|     T4 | 15 (22%) | 17 (25%) | 18 (28%) | | | |

¹ Median (IQR); n (%)

For one SMD per level of every categorical variable, specify location = "level":

trial %>% 
  tbl_summary(by = grade, include = c(age, stage)) %>%
  add_SMD(location = "level")

| Characteristic | I, N = 68¹ | II, N = 68¹ | III, N = 64¹ | SMD: I vs. II | SMD: I vs. III | SMD: II vs. III |
|---|---|---|---|---|---|---|
| Age | 47 (37, 56) | 49 (37, 57) | 47 (38, 58) | | | |
|     Unknown | 2 | 6 | 3 | | | |
| T Stage | | | | | | |
|     T1 | 17 (25%) | 23 (34%) | 13 (20%) | -0.19 | 0.11 | 0.31 |
|     T2 | 18 (26%) | 17 (25%) | 19 (30%) | 0.03 | -0.07 | -0.11 |
|     T3 | 18 (26%) | 11 (16%) | 14 (22%) | 0.25 | 0.11 | -0.15 |
|     T4 | 15 (22%) | 17 (25%) | 18 (28%) | -0.07 | -0.14 | -0.07 |

¹ Median (IQR); n (%)
*Note that this gives SMDs only for the levels of categorical variables; continuous variables such as Age get no SMD.
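
As a rough check on how to read the per-level values, the sketch below recomputes the T1 "I vs. II" SMD from the counts shown for that row (17 of 68 versus 23 of 68). It assumes the common formula for the standardized difference of a binary proportion. Whether add_SMD() uses exactly this formula internally is an assumption, and smd_binary is a hypothetical helper, not part of the package.

```r
# Hypothetical helper (not part of Z.gtsummary.addons): standardized difference
# between two proportions, scaled by the average of the two binomial variances.
smd_binary <- function(x1, n1, x2, n2) {
  p1 <- x1 / n1
  p2 <- x2 / n2
  (p1 - p2) / sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)
}

# T1 counts taken from the table: grade I = 17/68, grade II = 23/68
smd_t1 <- smd_binary(17, 68, 23, 68)
print(round(smd_t1, 3))
```

Under these assumptions the result rounds to -0.195, in line with the T1 entry in the "SMD: I vs. II" column (displayed as -0.19 at two decimals).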

There’s also the option to set location = “both” to get both kinds of SMDs simultaneously.

trial %>%
  tbl_summary(by = grade, include = c(age, stage)) %>%
  add_SMD(location = "both")

| Characteristic | I, N = 68¹ | II, N = 68¹ | III, N = 64¹ | SMD: I vs. II | SMD: I vs. III | SMD: II vs. III |
|---|---|---|---|---|---|---|
| Age | 47 (37, 56) | 49 (37, 57) | 47 (38, 58) | -0.10 | -0.13 | -0.04 |
|     Unknown | 2 | 6 | 3 | | | |
| T Stage | | | | 0.29 | 0.19 | 0.31 |
|     T1 | 17 (25%) | 23 (34%) | 13 (20%) | -0.19 | 0.11 | 0.31 |
|     T2 | 18 (26%) | 17 (25%) | 19 (30%) | 0.03 | -0.07 | -0.11 |
|     T3 | 18 (26%) | 11 (16%) | 14 (22%) | 0.25 | 0.11 | -0.15 |
|     T4 | 15 (22%) | 17 (25%) | 18 (28%) | -0.07 | -0.14 | -0.07 |

¹ Median (IQR); n (%)

To get confidence intervals, add ci = TRUE. With the decimals argument, you can adjust the number of decimal places displayed.

trial %>% 
  tbl_summary(by = grade, include = c(age, stage)) %>%
  add_SMD(location = "level", ci = TRUE, decimals = 3)

| Characteristic | I, N = 68¹ | II, N = 68¹ | III, N = 64¹ | SMD: I vs. II | SMD: I vs. III | SMD: II vs. III |
|---|---|---|---|---|---|---|
| Age | 47 (37, 56) | 49 (37, 57) | 47 (38, 58) | | | |
|     Unknown | 2 | 6 | 3 | | | |
| T Stage | | | | | | |
|     T1 | 17 (25%) | 23 (34%) | 13 (20%) | -0.195 (-0.531, 0.142) | 0.112 (-0.229, 0.454) | 0.308 (-0.036, 0.651) |
|     T2 | 18 (26%) | 17 (25%) | 19 (30%) | 0.034 (-0.303, 0.370) | -0.072 (-0.413, 0.270) | -0.105 (-0.447, 0.236) |
|     T3 | 18 (26%) | 11 (16%) | 14 (22%) | 0.253 (-0.084, 0.591) | 0.107 (-0.234, 0.449) | -0.146 (-0.487, 0.196) |
|     T4 | 15 (22%) | 17 (25%) | 18 (28%) | -0.069 (-0.406, 0.267) | -0.140 (-0.482, 0.202) | -0.071 (-0.412, 0.271) |

¹ Median (IQR); n (%)
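
To make the interval columns easier to interpret, here is a minimal sketch that recomputes the T1 "I vs. II" estimate and a 95% interval from the counts in the table. It assumes the standardized difference of two proportions as the point estimate and the usual large-sample standard error for a standardized mean difference. Both are assumptions about the method, not a description of the package internals.

```r
# Counts for T1 taken from the table: grade I = 17/68, grade II = 23/68
n1 <- 68
n2 <- 68
p1 <- 17 / n1
p2 <- 23 / n2

# Assumed point estimate: standardized difference of two proportions
d <- (p1 - p2) / sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)

# Assumed large-sample standard error of a standardized mean difference
se <- sqrt((n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2)))

# 95% Wald-type interval around the estimate
ci <- d + c(-1, 1) * qnorm(0.975) * se
print(round(c(estimate = d, lower = ci[1], upper = ci[2]), 3))
```

Under these assumptions the values round to -0.195 (-0.531, 0.142), which agrees with the T1 cell in the "SMD: I vs. II" column.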

To change the formatting of the confidence intervals, use the ci_bracket and ci_sep arguments:

trial %>%
  tbl_summary(by = grade, include = c(age, stage)) %>%
  add_SMD(location = "level", ci = TRUE, ci_bracket = "[]", ci_sep = ";")

| Characteristic | I, N = 68¹ | II, N = 68¹ | III, N = 64¹ | SMD: I vs. II | SMD: I vs. III | SMD: II vs. III |
|---|---|---|---|---|---|---|
| Age | 47 (37, 56) | 49 (37, 57) | 47 (38, 58) | | | |
|     Unknown | 2 | 6 | 3 | | | |
| T Stage | | | | | | |
|     T1 | 17 (25%) | 23 (34%) | 13 (20%) | -0.19 [-0.53;0.14] | 0.11 [-0.23;0.45] | 0.31 [-0.04;0.65] |
|     T2 | 18 (26%) | 17 (25%) | 19 (30%) | 0.03 [-0.30;0.37] | -0.07 [-0.41;0.27] | -0.11 [-0.45;0.24] |
|     T3 | 18 (26%) | 11 (16%) | 14 (22%) | 0.25 [-0.08;0.59] | 0.11 [-0.23;0.45] | -0.15 [-0.49;0.20] |
|     T4 | 15 (22%) | 17 (25%) | 18 (28%) | -0.07 [-0.41;0.27] | -0.14 [-0.48;0.20] | -0.07 [-0.41;0.27] |

¹ Median (IQR); n (%)


For weighted data, use tbl_svysummary():

In this example, the weights come from the WeightIt package, and the survey package supplies the svydesign object that tbl_svysummary() requires.

library(WeightIt) # To calculate weights
library(survey) # To create a svydesign object (a "weighted" dataset)

Application of the add_SMD() function is identical to the non-weighted case, but it is applied to a tbl_svysummary object instead of a tbl_summary object.

trial %>%
  mutate(w = weightit(grade ~ age + stage + trt, data = ., estimand = "ATT", focal = "I")$weights) %>% # create ATT weights with grade I as the focal group
  survey::svydesign(~1, data = ., weights = ~w) %>% # create the svydesign object
  tbl_svysummary(by = grade, include = c(age, stage)) %>%
  add_SMD(ref_group = TRUE)

| Characteristic | I, N = 68¹ | II, N = 68¹ | III, N = 68¹ | SMD: I vs. II | SMD: I vs. III |
|---|---|---|---|---|---|
| Age | 47 (37, 56) | 46 (34, 56) | 45 (38, 54) | -0.01 | -0.01 |
|     Unknown | 2 | 2 | 2 | | |
| T Stage | | | | 0.08 | 0.02 |
|     T1 | 17 (25%) | 19 (27%) | 17 (25%) | | |
|     T2 | 18 (26%) | 19 (27%) | 17 (26%) | | |
|     T3 | 18 (26%) | 17 (26%) | 18 (27%) | | |
|     T4 | 15 (22%) | 13 (19%) | 15 (23%) | | |

¹ Median (IQR); n (%)
*Note that with ref_group = TRUE, comparisons are made only between group I and each of the other groups.
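
The difference between the two comparison schemes can be sketched in base R by listing the SMD columns each one produces for the three grades. This is only an illustration of the column sets seen in the tables. It assumes the reference group is the first level of the grouping variable and does not reflect how the package builds its columns internally.

```r
groups <- c("I", "II", "III")

# Default: every pairwise comparison between groups
all_pairs <- combn(groups, 2, FUN = function(p) paste0("SMD: ", p[1], " vs. ", p[2]))

# ref_group = TRUE: compare the first group (assumed reference) with each other group
ref_pairs <- paste0("SMD: ", groups[1], " vs. ", groups[-1])

print(all_pairs)
print(ref_pairs)
```

With three groups this gives three columns by default and two with a reference group; the saving grows quickly as the number of groups increases.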



round_5_gtsummary()


The function can be run on both tbl_summary and tbl_svysummary objects.


For comparison, here’s a table without rounding:

trial %>% 
  tbl_summary(by = grade, include = c(trt, age, stage))

| Characteristic | I, N = 68¹ | II, N = 68¹ | III, N = 64¹ |
|---|---|---|---|
| Chemotherapy Treatment | | | |
|     Drug A | 35 (51%) | 32 (47%) | 31 (48%) |
|     Drug B | 33 (49%) | 36 (53%) | 33 (52%) |
| Age | 47 (37, 56) | 49 (37, 57) | 47 (38, 58) |
|     Unknown | 2 | 6 | 3 |
| T Stage | | | |
|     T1 | 17 (25%) | 23 (34%) | 13 (20%) |
|     T2 | 18 (26%) | 17 (25%) | 19 (30%) |
|     T3 | 18 (26%) | 11 (16%) | 14 (22%) |
|     T4 | 15 (22%) | 17 (25%) | 18 (28%) |

¹ n (%); Median (IQR)

Now, the same table with all counts, including the group sizes N, rounded to the nearest 5 and all percentages recalculated from the rounded values. Counts below 5 are shown as "<5":

trial %>% 
  tbl_summary(by = grade, include = c(trt, age, stage)) %>%
  round_5_gtsummary()

| Characteristic | I, N = 70¹ | II, N = 70¹ | III, N = 65¹ |
|---|---|---|---|
| Chemotherapy Treatment | | | |
|     Drug A | 35 (50%) | 30 (42.9%) | 30 (46.2%) |
|     Drug B | 35 (50%) | 35 (50%) | 35 (53.8%) |
| Age | 47 (37, 56) | 49 (37, 57) | 47 (38, 58) |
|     Unknown | <5 (<7.1%) | 5 (7.1%) | <5 (<7.7%) |
| T Stage | | | |
|     T1 | 15 (21.4%) | 25 (35.7%) | 15 (23.1%) |
|     T2 | 20 (28.6%) | 15 (21.4%) | 20 (30.8%) |
|     T3 | 20 (28.6%) | 10 (14.3%) | 15 (23.1%) |
|     T4 | 15 (21.4%) | 15 (21.4%) | 20 (30.8%) |

¹ n (%); Median (IQR)
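
The rounding rule can be illustrated with a small base R sketch that reproduces the T Stage cells for grade I from the unrounded counts. The helpers round_to_5 and format_rounded are hypothetical and are not the package's implementation. The "<5" rule for raw counts below 5 is inferred from the displayed output (raw counts 2 and 3 appear as "<5", while 6 appears as 5), and how zero counts are treated is not covered here.

```r
# Hypothetical helper: round a count to the nearest multiple of 5
round_to_5 <- function(n) 5 * round(n / 5)

# Hypothetical helper: build "count (percent)" cells from raw counts n and raw group size N
format_rounded <- function(n, N) {
  N5 <- round_to_5(N) # rounded group size, e.g. 68 -> 70
  n5 <- round_to_5(n) # rounded cell counts, e.g. 17 -> 15
  ifelse(
    n < 5,
    paste0("<5 (<", round(100 * 5 / N5, 1), "%)"), # assumed suppression of small cells
    paste0(n5, " (", round(100 * n5 / N5, 1), "%)") # percentage of the rounded N
  )
}

# Raw T Stage counts for grade I (N = 68), taken from the unrounded table
stage_grade_I <- c(T1 = 17, T2 = 18, T3 = 18, T4 = 15)
cells <- format_rounded(stage_grade_I, N = 68)
print(cells)

# Raw "Unknown" count for Age in grade I
unknown_cell <- format_rounded(2, N = 68)
print(unknown_cell)
```

Because the percentages are computed from the rounded counts and the rounded N, they cannot be combined with N to recover the exact original counts.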

Finally, the function can also be applied to weighted data:

library(WeightIt) # for calculating weights
weighted_trial <- trial %>%
  mutate(w = weightit(grade ~ trt + age + stage, data = ., estimand = "ATT", focal = "I")$weights) %>% # ATT weights with grade I as the focal group
  survey::svydesign(~1, data = ., weights = ~w) # create the svydesign object

weighted_trial %>%
  tbl_svysummary(by = grade, include = c(trt, age, stage)) %>%
  round_5_gtsummary()

| Characteristic | I, N = 70¹ | II, N = 70¹ | III, N = 70¹ |
|---|---|---|---|
| Chemotherapy Treatment | | | |
|     Drug A | 35 (50%) | 35 (50%) | 35 (50%) |
|     Drug B | 35 (50%) | 30 (42.9%) | 35 (50%) |
| Age | 47 (37, 56) | 46 (34, 56) | 45 (38, 54) |
|     Unknown | <5 (<7.1%) | <5 (<7.1%) | <5 (<7.1%) |
| T Stage | | | |
|     T1 | 15 (21.4%) | 20 (28.6%) | 15 (21.4%) |
|     T2 | 20 (28.6%) | 20 (28.6%) | 15 (21.4%) |
|     T3 | 20 (28.6%) | 15 (21.4%) | 20 (28.6%) |
|     T4 | 15 (21.4%) | 15 (21.4%) | 15 (21.4%) |

¹ n (%); Median (IQR)
