Title: | An Efficient Workflow for Plausibility Checks and Prevalence Analysis of Wasting in R |
---|---|
Description: | A simple and streamlined workflow for plausibility checks and prevalence analysis of wasting based on the Standardized Monitoring and Assessment of Relief and Transition (SMART) Methodology <https://smartmethodology.org/>, with application in R. |
Authors: | Tomás Zaba [aut, cre, cph] |
Maintainer: | Tomás Zaba <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.2.1.9000 |
Built: | 2025-03-07 06:21:41 UTC |
Source: | https://github.com/nutriverse/mwana |
anthro.01 contains data from a two-stage cluster-based survey in which clusters were selected with probability proportional to the size of the population. The survey employed the SMART methodology.
anthro.01
A tibble of 1,191 rows and 11 columns.
Variable | Description |
area | Survey location |
dos | Survey date |
cluster | Primary sampling unit |
team | Enumerator IDs |
sex | Sex; "m" = boys, "f" = girls |
dob | Date of birth |
age | Age in months, typically estimated using local event calendars |
weight | Weight in kilograms |
height | Height in centimetres |
edema | Edema; "n" = no edema, "y" = with edema |
muac | Mid-upper arm circumference in millimetres |
Anonymous
anthro.01
Data from a household budget survey conducted in Mozambique in 2019/2020, known as IOF (Inquérito ao Orçamento Familiar in Portuguese). IOF is a two-stage cluster-based survey, representative at province level (second administrative level), with probability of selection of the clusters proportional to the size of the population. Data collection spanned a period of 12 months.
anthro.02
A tibble of 2,267 rows and 14 columns.
Variable | Description |
province | The administrative unit level 1 where data was collected |
strata | Rural or Urban |
cluster | Primary sampling unit |
sex | Sex; "m" = boys, "f" = girls |
age | Calculated age in months with two decimal places |
weight | Weight in kilograms |
height | Height in centimetres |
edema | Edema; "n" = no edema, "y" = with edema |
muac | Mid-upper arm circumference in millimetres |
wtfactor | Survey weights |
wfhz | Weight-for-height z-scores with 3 decimal places |
flag_wfhz | Flagged WFHZ value. 1 = flagged, 0 = not flagged |
mfaz | MUAC-for-age z-scores with 3 decimal places |
flag_mfaz | Flagged MFAZ value. 1 = flagged, 0 = not flagged |
Mozambique National Institute of Statistics. The data is publicly available at https://mozdata.ine.gov.mz/index.php/catalog/88#metadata-data_access. Data was wrangled using this package's wranglers. Details about survey design can be read from: https://mozdata.ine.gov.mz/index.php/catalog/88#metadata-sampling
anthro.02
anthro.03 contains survey data from four districts. Each district data set presents a distinct data quality scenario that requires a specific prevalence analysis approach. Data from two districts have a problematic WFHZ standard deviation. The data from the remaining two districts are all within range.
This sample data is useful for demonstrating the prevalence functions on multiple-domain survey data, where the acceptability rating of the standard deviation can vary between domains and therefore a different analytical approach is required for each survey domain to ensure accurate estimation.
anthro.03
A tibble of 943 rows and 9 columns.
Variable | Description |
district | Survey location |
cluster | Primary sampling unit |
team | Survey teams |
sex | Sex; "m" = boys, "f" = girls |
age | Calculated age in months with two decimal places |
weight | Weight in kilograms |
height | Height in centimetres |
edema | Edema; "n" = no edema, "y" = with edema |
muac | Mid-upper arm circumference in millimetres |
Anonymous
anthro.03
Data was collected from community-based sentinel sites located across three provinces. Each provincial data set presents a distinct data quality scenario, requiring tailored prevalence analysis:
Province 1 has both the MUAC-for-age z-score standard deviation and the age ratio test rated as within the acceptable range.
Province 2 has the age ratio test rated as problematic but an acceptable standard deviation of MUAC-for-age z-scores.
Province 3 has both tests rated as problematic.
This sample data is useful for demonstrating the prevalence functions on multiple-domain survey data, where the acceptability ratings vary between domains and therefore a different analytical approach is required for each domain to ensure accurate estimation.
anthro.04
A tibble of 3,002 rows and 8 columns.
Variable | Description |
province | Survey location |
cluster | Primary sampling unit |
sex | Sex; "m" = boys, "f" = girls |
age | Calculated age in months with two decimal places |
muac | Mid-upper arm circumference in millimetres |
edema | Edema; "n" = no edema, "y" = with edema |
mfaz | MUAC-for-age z-scores with 3 decimal places |
flag_mfaz | Flagged MUAC-for-age z-score value; 1 = flagged, 0 = not flagged |
Anonymous
anthro.04
Determine whether a given observation in the data set is wasted or not, and its respective form of wasting (global, severe or moderate), on the basis of z-scores of weight-for-height (WFHZ), z-scores of MUAC-for-age (MFAZ), raw MUAC values or the combined case-definition.
define_wasting( df, zscores = NULL, muac = NULL, edema = NULL, .by = c("zscores", "muac", "combined") )
df |
A |
zscores |
A vector of class |
muac |
An |
edema |
A |
.by |
A choice of the criterion by which a case is to be defined. Choose "zscores" for WFHZ or MFAZ, "muac" for raw MUAC and "combined" for combined. Default value is "zscores". |
The tibble object df with additional columns named gam, sam and mam, each of class numeric containing coded values of either 1 (case) or 0 (not a case). If .by = "combined", the additional columns are named cgam, csam and cmam.
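The coding of the gam, sam and mam columns can be illustrated with a minimal base R sketch for the z-score case. The helper `classify_wasting_zscores()` is hypothetical, and the cut-offs of -2 and -3 are the commonly used WHO thresholds, assumed here rather than taken from the package source.

```r
# Illustrative only: the thresholds below are the commonly used WHO cut-offs
# and are an assumption here, not read from the package source.
classify_wasting_zscores <- function(zscores, edema = NULL) {
  # Treat absent or missing edema information as "no edema"
  has_edema <- if (is.null(edema)) rep(FALSE, length(zscores)) else edema == "y"
  has_edema[is.na(has_edema)] <- FALSE

  sam <- as.numeric(zscores < -3 | has_edema)                  # severe
  mam <- as.numeric(zscores >= -3 & zscores < -2 & !has_edema) # moderate
  gam <- as.numeric(zscores < -2 | has_edema)                  # global

  data.frame(gam = gam, sam = sam, mam = mam)
}

toy <- data.frame(
  wfhz  = c(-3.4, -2.5, -1.0, 0.3, -0.2),
  edema = c("n", "n", "n", "n", "y")
)
res <- cbind(toy, classify_wasting_zscores(toy$wfhz, toy$edema))
res
```

In this toy data the last child is counted as a severe (and therefore global) case because of edema, even though the z-score is within the normal range.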
## Case-definition by z-scores ----
z <- anthro.02 |>
  define_wasting(
    zscores = wfhz,
    muac = NULL,
    edema = edema,
    .by = "zscores"
  )
head(z)

## Case-definition by MUAC ----
m <- anthro.02 |>
  define_wasting(
    zscores = NULL,
    muac = muac,
    edema = edema,
    .by = "muac"
  )
head(m)

## Case-definition by combined ----
c <- anthro.02 |>
  define_wasting(
    zscores = wfhz,
    muac = muac,
    edema = edema,
    .by = "combined"
  )
head(c)
Identify outlier z-scores for weight-for-height (WFHZ) and MUAC-for-age (MFAZ) following the SMART methodology. The function can also be used to detect outliers for height-for-age (HFAZ) and weight-for-age (WFAZ) z-scores following the same approach.
For z-scores, values that deviate substantially from the sample's z-score mean are considered outliers and are unlikely to reflect accurate measurements. For raw MUAC, values that are less than 100 millimetres or greater than 200 millimetres are considered outliers, as recommended by Bilukha & Kianian (2023). Including these values in the analysis could compromise the accuracy of the resulting estimates.
To remove outliers, their values are set to NA rather than removing the record from the dataset. This process is also called censoring. By assigning NA values to these outliers, they can be effectively removed during statistical operations with functions that allow for removal of NA values, such as mean() for getting the mean value or sd() for getting the standard deviation.
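The flag-then-censor idea can be sketched in base R as follows. The helpers `flag_raw_muac()`, `flag_zscores()` and `censor()` are hypothetical and are not the package's implementation. The raw MUAC limits come from the text above, whereas the limit of 3 z-score units from the observed mean is an assumption standing in for "deviate substantially".

```r
# Hypothetical helpers for illustration; not the package's implementation.
flag_raw_muac <- function(x) {
  # Raw MUAC (mm): values below 100 or above 200 are treated as outliers
  ifelse(!is.na(x) & (x < 100 | x > 200), 1, 0)
}

flag_zscores <- function(x, limit = 3) {
  # ASSUMPTION: an outlier lies more than `limit` z-score units away from
  # the observed sample mean
  centre <- mean(x, na.rm = TRUE)
  ifelse(!is.na(x) & abs(x - centre) > limit, 1, 0)
}

censor <- function(x, flags) {
  # Set flagged values to NA instead of dropping the record
  replace(x, flags == 1, NA)
}

muac <- c(95, 123, 140, 205, 131)
muac_flags <- flag_raw_muac(muac)
muac_clean <- censor(muac, muac_flags)

# Censored values are skipped when NA removal is requested
mean(muac_clean, na.rm = TRUE)
sd(muac_clean, na.rm = TRUE)
```

Because the records are kept and only the values are censored, the vector keeps its original length and can stay aligned with the other columns of the data set.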
flag_outliers(x, .from = c("zscores", "raw_muac"))
remove_flags(x, .from = c("zscores", "raw_muac"))
x |
A |
.from |
Either "zscores" or "raw_muac" for type of data to flag outliers from. |
A vector of the same length as x of flagged records, coded as 1 for a flagged record and 0 for a non-flagged record.
Bilukha, O., & Kianian, B. (2023). Considerations for assessment of measurement quality of mid‐upper arm circumference data in anthropometric surveys and mass nutritional screenings conducted in humanitarian and refugee settings. Maternal & Child Nutrition, 19, e13478. Available at https://doi.org/10.1111/mcn.13478
SMART Initiative (2017). Standardized Monitoring and Assessment for Relief and Transition. Manual 2.0. Available at: https://smartmethodology.org.
## Sample data of raw MUAC values ----
x <- anthro.01$muac

## Apply the function with `.from` set to "raw_muac" ----
m <- flag_outliers(x, .from = "raw_muac")
head(m)

## Sample data of z-scores (be it WFHZ, MFAZ, HFAZ or WFAZ) ----
x <- anthro.02$mfaz

## Apply the function with `.from` set to "zscores" ----
z <- flag_outliers(x, .from = "zscores")
tail(z)

## With `.from` set to "zscores" ----
z <- remove_flags(
  x = wfhz.01$wfhz,
  .from = "zscores"
)
head(z)

## With `.from` set to "raw_muac" ----
m <- remove_flags(
  x = mfaz.01$muac,
  .from = "raw_muac"
)
tail(m)
Calculate child's age in months based on the date of birth and the date of data collection.
get_age_months(dos, dob)
dos |
A |
dob |
A |
A numeric vector of child's age in months. Any value less than 6.0 or greater than or equal to 60.0 months is set to NA.
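The calculation and the range restriction can be sketched in base R. The helper `age_in_months()` is hypothetical, and the conversion of days to months using an average month of 365.25 / 12 days is an assumption, since the package may use a different convention.

```r
# ASSUMPTION: one month is taken as 365.25 / 12 days; the package may use a
# different convention.
age_in_months <- function(dos, dob) {
  days <- as.numeric(dos - dob)
  months <- days / (365.25 / 12)
  # Keep only ages in [6, 60); everything else, including missing dates,
  # becomes NA
  ifelse(!is.na(months) & months >= 6 & months < 60, months, NA_real_)
}

dos <- as.Date(c("2024-01-05", "2024-01-08", "2024-01-08", "2024-01-10"))
dob <- as.Date(c("2022-04-04", "2017-12-12", NA, "2023-10-24"))
ages <- age_in_months(dos, dob)
round(ages, 2)
```

Only the first child falls within the 6 to 59 months range. The second is too old, the third has no date of birth, and the fourth is younger than 6 months, so all three are returned as NA.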
## Take two vectors of class "Date" ----
surv_date <- as.Date(
  c(
    "2024-01-05", "2024-01-05", "2024-01-05", "2024-01-08", "2024-01-08",
    "2024-01-08", "2024-01-10", "2024-01-10", "2024-01-10", "2024-01-11"
  )
)
bir_date <- as.Date(
  c(
    "2022-04-04", "2021-05-01", "2023-05-24", "2017-12-12", NA,
    "2020-12-12", "2022-04-04", "2021-05-01", "2023-05-24", "2020-12-12"
  )
)

## Apply the function ----
get_age_months(
  dos = surv_date,
  dob = bir_date
)
Sample mid-upper arm circumference (MUAC) screening data.
mfaz.01
A tibble with 661 rows and 4 columns.
Variable | Description |
sex | Sex; "m" = boys, "f" = girls |
months | Calculated age in months with two decimal places |
edema | Edema, "n" = no edema, "y" = with edema |
muac | Mid-upper arm circumference in millimetres |
Anonymous
mfaz.01
Sample SMART survey data with mid-upper arm circumference measurements.
mfaz.02
A tibble with 303 rows and 7 columns.
Variable | Description |
cluster | Primary sampling unit |
sex | Sex; "m" = boys, "f" = girls |
age | Calculated age in months with two decimal places |
edema | Edema, "n" = no edema, "y" = with edema |
mfaz | MUAC-for-age z-scores with 3 decimal places |
flag_mfaz | Flagged MUAC-for-age z-score value. 1 = flagged, 0 = not flagged |
Anonymous
mfaz.02
Data for estimating the prevalence of acute malnutrition used in the IPC AMN can come from different sources: surveys, screenings or community-based surveillance systems. The IPC has set minimum sample size requirements for each source. This function verifies whether these requirements are met.
mw_check_ipcamn_ssreq(df, cluster, .source = c("survey", "screening", "ssite"))
df |
A |
cluster |
A vector of class |
.source |
The source of evidence. A choice between "survey" for representative survey data at the area of analysis; "screening" for screening data; "ssite" for community-based sentinel site data. Default value is "survey". |
A single row summary tibble with 3 columns containing check results for:
n_clusters - the total number of unique clusters or screening or site identifiers;
n_obs - the corresponding total number of children in the data set; and,
meet_ipc - whether the IPC AMN requirements were met.
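The shape of this summary can be sketched in base R. The helper `check_sample_size()` and its `min_clusters` and `min_obs` arguments are hypothetical. The thresholds are supplied by the caller as placeholders, because the actual IPC AMN requirements are not restated here and should be taken from the IPC manual.

```r
# Hypothetical helper: `min_clusters` and `min_obs` are placeholders supplied
# by the caller; consult the IPC manual for the actual requirements.
check_sample_size <- function(df, cluster, min_clusters, min_obs = 0) {
  n_clusters <- length(unique(df[[cluster]]))  # unique cluster/site identifiers
  n_obs <- nrow(df)                            # children in the data set
  data.frame(
    n_clusters = n_clusters,
    n_obs = n_obs,
    meet_ipc = ifelse(n_clusters >= min_clusters & n_obs >= min_obs, "yes", "no")
  )
}

toy <- data.frame(cluster = rep(1:4, each = 5), muac = 120 + seq_len(20))
res <- check_sample_size(toy, cluster = "cluster", min_clusters = 3, min_obs = 10)
res
```

Keeping the thresholds as arguments makes it easy to apply a different requirement for each source of evidence without changing the counting logic.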
IPC Global Partners (2021). Integrated Food Security Phase Classification Technical Manual Version 3.1. Evidence and Standards for Better Food Security and Nutrition Decisions. Rome. Available at: https://www.ipcinfo.org/ipcinfo-website/resources/ipc-manual/en/.
mw_check_ipcamn_ssreq(
  df = anthro.01,
  cluster = cluster,
  .source = "survey"
)
Estimate the prevalence of wasting based on the combined case-definition of weight-for-height z-scores (WFHZ), MUAC and/or edema. The function allows users to estimate prevalence in accordance with complex sample design properties, such as accounting for survey sample weights when needed or applicable. The quality of the data is first evaluated by calculating and rating the standard deviation of WFHZ and MFAZ and the p-value of the age ratio test. Prevalence is calculated only when all tests are rated as not problematic. If any of the tests is rated as problematic, no estimation is done and an NA value is returned. Outliers are detected in both WFHZ and MFAZ data based on SMART flagging criteria. Identified outliers are then excluded before prevalence estimation is performed.
mw_estimate_prevalence_combined(df, wt = NULL, edema = NULL, .by = NULL)
df |
A |
wt |
A vector of class |
edema |
A |
.by |
A |
A concept of combined flags is introduced in this function. Any observation that is flagged for either flag_wfhz or flag_mfaz is flagged under a new variable named cflags added to df. This ensures that all flagged observations from both WFHZ and MFAZ data are excluded from the prevalence analysis.
flag_wfhz | flag_mfaz | cflags |
1 | 0 | 1 |
0 | 1 | 1 |
0 | 0 | 0 |
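The rule in the table above amounts to a logical OR of the two flag columns. A minimal base R sketch on a toy data frame, which is not the package's internal code, reproduces the three rows shown:

```r
toy <- data.frame(
  flag_wfhz = c(1, 0, 0),
  flag_mfaz = c(0, 1, 0)
)

# An observation is flagged overall if either of the two flags is set
toy$cflags <- ifelse(toy$flag_wfhz == 1 | toy$flag_mfaz == 1, 1, 0)
toy
```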
A summary tibble for the descriptive statistics about combined wasting.
## When .by and wt are set to NULL ----
mw_estimate_prevalence_combined(
  df = anthro.02,
  wt = NULL,
  edema = edema,
  .by = NULL
)

## When wt is not set to NULL ----
mw_estimate_prevalence_combined(
  df = anthro.02,
  wt = wtfactor,
  edema = edema,
  .by = NULL
)
Calculate the prevalence estimates of wasting based on z-scores of MUAC-for-age and/or bilateral edema. The function allows users to estimate prevalence in accordance with complex sample design properties, such as accounting for survey sample weights when needed or applicable. The quality of the data is first evaluated by calculating and rating the standard deviation of MFAZ. The standard approach to prevalence estimation is used only when the standard deviation of MFAZ is rated as not problematic. If the standard deviation is problematic, prevalence is estimated using the PROBIT estimator. Outliers are detected based on SMART flagging criteria. Identified outliers are then excluded before prevalence estimation is performed.
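To give a sense of what a PROBIT-type estimate looks like, the sketch below reads prevalence off a normal curve rather than counting cases. It assumes that the curve is centred on the observed mean z-score with the standard deviation fixed at 1, and that -2 and -3 are the global and severe cut-offs. The helper `probit_prevalence()` is hypothetical and the package's estimator may differ in detail.

```r
# ASSUMPTION: prevalence is the area of a normal curve below the cut-off,
# centred on the observed mean with a standard deviation fixed at 1.
probit_prevalence <- function(z, cutoff) {
  pnorm(q = cutoff, mean = mean(z, na.rm = TRUE), sd = 1)
}

z <- c(-2.8, -1.9, -1.2, -0.7, -0.4, 0.1, 0.6, NA)

gam <- probit_prevalence(z, cutoff = -2)  # global: below -2
sam <- probit_prevalence(z, cutoff = -3)  # severe: below -3
mam <- gam - sam                          # moderate: between -3 and -2

round(100 * c(gam = gam, sam = sam, mam = mam), 1)
```

Because the estimate depends only on the mean, it is not inflated by an implausibly wide spread of z-scores, which is the situation a problematic standard deviation signals.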
mw_estimate_prevalence_mfaz(df, wt = NULL, edema = NULL, .by = NULL)
df |
A |
wt |
A vector of class |
edema |
A |
.by |
A |
A summary tibble for the descriptive statistics about wasting.
## When .by = NULL ----
mw_estimate_prevalence_mfaz(
  df = anthro.04,
  wt = NULL,
  edema = edema,
  .by = NULL
)

## When .by is not set to NULL ----
mw_estimate_prevalence_mfaz(
  df = anthro.04,
  wt = NULL,
  edema = edema,
  .by = province
)
Estimate the prevalence of wasting based on MUAC and/or nutritional edema. The function allows users to estimate prevalence in accordance with complex sample design properties, such as accounting for survey sample weights when needed or applicable. The quality of the data is first evaluated by calculating and rating the standard deviation of MFAZ and the p-value of the age ratio test. Prevalence is calculated only when the standard deviation of MFAZ is not problematic. If both the standard deviation of MFAZ and the p-value of the age ratio test are not problematic, straightforward prevalence estimation is performed. If the standard deviation of MFAZ is not problematic but the p-value of the age ratio test is problematic, age-weighting is applied to prevalence estimation to account for the over-representation of younger children in the sample. If the standard deviation of MFAZ is problematic, no estimation is done and an NA value is returned. Outliers are detected based on SMART flagging criteria for MFAZ. Identified outliers are then excluded before prevalence estimation is performed.
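The decision rules in the paragraph above can be summarised in a small base R sketch. The helper `choose_muac_approach()` and the rating label "Problematic" are assumptions made for illustration; the function only returns the name of the approach and does not estimate anything.

```r
# Hypothetical helper mirroring the decision rules described above
choose_muac_approach <- function(sd_rating, ageratio_rating) {
  if (sd_rating == "Problematic") {
    return(NA_character_)   # no estimation is done
  }
  if (ageratio_rating == "Problematic") {
    return("age-weighted")  # correct for the excess of younger children
  }
  "unweighted"              # straightforward estimation
}

scenarios <- data.frame(
  sd_rating       = c("Good", "Acceptable", "Problematic"),
  ageratio_rating = c("Good", "Problematic", "Good")
)
scenarios$approach <- mapply(
  choose_muac_approach,
  scenarios$sd_rating,
  scenarios$ageratio_rating
)
scenarios
```

Checking the standard deviation first reflects the priority stated in the text: a problematic standard deviation rules out estimation regardless of the age ratio result.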
mw_estimate_prevalence_muac(df, wt = NULL, edema = NULL, .by = NULL)
mw_estimate_smart_age_wt(df, edema = NULL, .by = NULL)
df |
A |
wt |
A vector of class |
edema |
A |
.by |
A |
A summary tibble for the descriptive statistics about combined wasting.
SMART Initiative (no date). Updated MUAC data collection tool. Available at: https://smartmethodology.org/survey-planning-tools/updated-muac-tool/
mw_estimate_smart_age_wt()
mw_estimate_prevalence_mfaz()
mw_estimate_prevalence_screening()
## When .by = NULL ----
mw_estimate_prevalence_muac(
  df = anthro.04,
  wt = NULL,
  edema = edema,
  .by = NULL
)

## When .by is not set to NULL ----
mw_estimate_prevalence_muac(
  df = anthro.04,
  wt = NULL,
  edema = edema,
  .by = province
)

## An application of `mw_estimate_smart_age_wt()` ----
.data <- anthro.04 |>
  subset(province == "Province 2")

mw_estimate_smart_age_wt(
  df = .data,
  edema = edema,
  .by = NULL
)
It is common to estimate the prevalence of wasting from non-survey data, such as screenings or other community-based surveillance systems. In such situations, the analysis usually consists only of estimating the point prevalence and the counts of positive cases, without necessarily estimating the uncertainty. This function serves this use.
The quality of the data is first evaluated by calculating and rating the standard deviation of MFAZ and the p-value of the age ratio test. Prevalence is calculated only when the standard deviation of MFAZ is not problematic. If both the standard deviation of MFAZ and the p-value of the age ratio test are not problematic, straightforward prevalence estimation is performed. If the standard deviation of MFAZ is not problematic but the p-value of the age ratio test is problematic, age-weighting is applied to prevalence estimation to account for the over-representation of younger children in the sample. If the standard deviation of MFAZ is problematic, no estimation is done and an NA value is returned. Outliers are detected based on SMART flagging criteria for MFAZ. Identified outliers are then excluded before prevalence estimation is performed.
mw_estimate_prevalence_screening(df, muac, edema = NULL, .by = NULL)
df |
A |
muac |
A |
edema |
A |
.by |
A |
A summary tibble for the descriptive statistics about combined wasting.
SMART Initiative (no date). Updated MUAC data collection tool. Available at: https://smartmethodology.org/survey-planning-tools/updated-muac-tool/
mw_estimate_prevalence_muac()
mw_estimate_smart_age_wt()
mw_estimate_prevalence_screening(
  df = anthro.02,
  muac = muac,
  edema = edema,
  .by = province
)

## With `edema` set to `NULL` ----
mw_estimate_prevalence_screening(
  df = anthro.02,
  muac = muac,
  edema = NULL,
  .by = province
)

## With `.by` set to `NULL` ----
mw_estimate_prevalence_screening(
  df = anthro.02,
  muac = muac,
  edema = NULL,
  .by = NULL
)
Calculate the prevalence estimates of wasting based on z-scores of weight-for-height and/or nutritional edema. The function allows users to estimate prevalence in accordance with complex sample design properties, such as accounting for survey sample weights when needed or applicable. The quality of the data is first evaluated by calculating and rating the standard deviation of WFHZ. The standard approach to prevalence estimation is used only when the standard deviation of WFHZ is rated as not problematic. If the standard deviation is problematic, prevalence is estimated using the PROBIT estimator. Outliers are detected based on SMART flagging criteria. Identified outliers are then excluded before prevalence estimation is performed.
mw_estimate_prevalence_wfhz(df, wt = NULL, edema = NULL, .by = NULL)
df |
A |
wt |
A vector of class |
edema |
A |
.by |
A |
A summary tibble for the descriptive statistics about wasting.
## When .by = NULL ----
### Start off by wrangling the data ----
data <- mw_wrangle_wfhz(
  df = anthro.03,
  sex = sex,
  weight = weight,
  height = height,
  .recode_sex = TRUE
)

### Now run the prevalence function ----
mw_estimate_prevalence_wfhz(
  df = data,
  wt = NULL,
  edema = edema,
  .by = NULL
)

## Now when .by is not set to NULL ----
mw_estimate_prevalence_wfhz(
  df = data,
  wt = NULL,
  edema = edema,
  .by = district
)

## When a weighted analysis is needed ----
mw_estimate_prevalence_wfhz(
  df = anthro.02,
  wt = wtfactor,
  edema = edema,
  .by = province
)
Converts scientific notations to standard notations, rounds off values, and renames columns to meaningful names.
mw_neat_output_mfaz(df)
df |
An |
A data.frame object of the same length and width as df, with column names and values formatted as appropriate.
## First wrangle age data ----
data <- mw_wrangle_age(
  df = anthro.01,
  dos = dos,
  dob = dob,
  age = age,
  .decimals = 2
)

## Then wrangle MUAC data ----
data_mfaz <- mw_wrangle_muac(
  df = data,
  sex = sex,
  age = age,
  muac = muac,
  .recode_sex = TRUE,
  .recode_muac = TRUE,
  .to = "cm"
)

## Then run plausibility check ----
pl <- mw_plausibility_check_mfaz(
  df = data_mfaz,
  flags = flag_mfaz,
  sex = sex,
  muac = muac,
  age = age
)

## Now neat the output table ----
mw_neat_output_mfaz(df = pl)
Converts scientific notations to standard notations, rounds off values, and renames columns to meaningful names.
mw_neat_output_muac(df)
df |
A |
A data.frame object of the same length and width as df, with column names and values formatted for clarity and readability.
## First wrangle MUAC data ----
df_muac <- mw_wrangle_muac(
  df = anthro.01,
  sex = sex,
  muac = muac,
  age = NULL,
  .recode_sex = TRUE,
  .recode_muac = FALSE,
  .to = "none"
)

## Then run the plausibility check ----
pl_muac <- mw_plausibility_check_muac(
  df = df_muac,
  flags = flag_muac,
  sex = sex,
  muac = muac
)

## Neat the output table ----
mw_neat_output_muac(df = pl_muac)
Converts scientific notations to standard notations, rounds off values, and renames columns to meaningful names.
mw_neat_output_wfhz(df)
df |
An |
A tibble object of the same length and width as df, with column names and values formatted for clarity and readability.
## First wrangle age data ----
data <- mw_wrangle_age(
  df = anthro.01,
  dos = dos,
  dob = dob,
  age = age,
  .decimals = 2
)

## Then wrangle WFHZ data ----
data_wfhz <- mw_wrangle_wfhz(
  df = data,
  sex = sex,
  weight = weight,
  height = height,
  .recode_sex = TRUE
)

## Now run the plausibility check ----
pl <- mw_plausibility_check_wfhz(
  df = data_wfhz,
  sex = sex,
  age = age,
  weight = weight,
  height = height,
  flags = flag_wfhz
)

## Now neat the output table ----
mw_neat_output_wfhz(df = pl)
Check the overall plausibility and acceptability of MFAZ data through a structured test suite encompassing checks for sampling and measurement-related biases in the dataset. This test suite follows the recommendation made by Bilukha & Kianian (2023) on the feasibility of constructing a comprehensive plausibility check for MUAC data, similar to that for weight-for-height z-scores, to evaluate its acceptability when age values are available in the dataset.
The function works on a data.frame returned from this package's wrangling functions for age and for MUAC-for-age z-score data.
mw_plausibility_check_mfaz(df, sex, muac, age, flags)
df |
A |
sex |
A |
muac |
A |
age |
A vector of class |
flags |
A |
Whilst the function uses the same checks and criteria as those for weight-for-height z-scores in the SMART plausibility check, the percent of flagged records is evaluated using different cut-off points, with a maximum acceptability of 2.0% as shown below:
Excellent | Good | Acceptable | Problematic |
0.0 - 1.0 | >1.0 - 1.5 | >1.5 - 2.0 | >2.0 |
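The cut-off points in the table above translate directly into a rating function. The following base R sketch uses a hypothetical helper named `rate_flagged()` built on `cut()`; it illustrates the thresholds only and is not the package's rating code.

```r
rate_flagged <- function(pct) {
  cut(
    pct,
    breaks = c(0, 1.0, 1.5, 2.0, Inf),
    labels = c("Excellent", "Good", "Acceptable", "Problematic"),
    include.lowest = TRUE,  # so that 0.0 falls in "Excellent"
    right = TRUE            # intervals are closed on the upper bound
  )
}

ratings <- as.character(rate_flagged(c(0, 1.0, 1.2, 2.0, 3.4)))
ratings
# "Excellent" "Excellent" "Good" "Acceptable" "Problematic"
```

Note that a value exactly on a boundary, such as 1.0% or 2.0%, takes the more favourable rating, in line with the ">1.0" and ">2.0" notation in the table.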
A single row summary tibble with 17 columns containing the plausibility check results and their respective acceptability ratings.
Bilukha, O., & Kianian, B. (2023). Considerations for assessment of measurement quality of mid‐upper arm circumference data in anthropometric surveys and mass nutritional screenings conducted in humanitarian and refugee settings. Maternal & Child Nutrition, 19, e13478. https://doi.org/10.1111/mcn.13478
SMART Initiative (2017). Standardized Monitoring and Assessment for Relief and Transition. Manual 2.0. Available at: https://smartmethodology.org.
mw_wrangle_age()
mw_wrangle_muac()
mw_stattest_ageratio()
flag_outliers()
## First wrangle age data ----
data <- mw_wrangle_age(
  df = anthro.01,
  dos = dos,
  dob = dob,
  age = age,
  .decimals = 2
)

## Then wrangle MUAC data ----
data_muac <- mw_wrangle_muac(
  df = data,
  sex = sex,
  age = age,
  muac = muac,
  .recode_sex = TRUE,
  .recode_muac = TRUE,
  .to = "cm"
)

## And finally run plausibility check ----
mw_plausibility_check_mfaz(
  df = data_muac,
  flags = flag_mfaz,
  sex = sex,
  muac = muac,
  age = age
)
Check the overall plausibility and acceptability of raw MUAC data through a structured test suite encompassing checks for sampling and measurement-related biases in the dataset. The test suite in this function follows the recommendation made by Bilukha & Kianian (2023).
mw_plausibility_check_muac(df, sex, muac, flags)
df |
A |
sex |
A |
muac |
A vector of class |
flags |
A |
Cut-off points used for the percent of flagged records:
Excellent | Good | Acceptable | Problematic |
0.0 - 1.0 | >1.0 - 1.5 | >1.5 - 2.0 | >2.0 |
A single row summary tibble with 9 columns containing the plausibility check results and their respective acceptability ratings.
Bilukha, O., & Kianian, B. (2023). Considerations for assessment of measurement quality of mid‐upper arm circumference data in anthropometric surveys and mass nutritional screenings conducted in humanitarian and refugee settings. Maternal & Child Nutrition, 19, e13478. https://doi.org/10.1111/mcn.13478
SMART Initiative (2017). Standardized Monitoring and Assessment for Relief and Transition. Manual 2.0. Available at: https://smartmethodology.org.
mw_wrangle_muac()
flag_outliers()
## First wrangle MUAC data ----
df_muac <- mw_wrangle_muac(
  df = anthro.01,
  sex = sex,
  muac = muac,
  age = NULL,
  .recode_sex = TRUE,
  .recode_muac = FALSE,
  .to = "none"
)

## Then run the plausibility check ----
mw_plausibility_check_muac(
  df = df_muac,
  flags = flag_muac,
  sex = sex,
  muac = muac
)
Check the overall plausibility and acceptability of WFHZ data through a structured test suite encompassing checks for sampling and measurement-related biases in the dataset. The test suite, including the criteria and corresponding rating of acceptability, follows the standards in the SMART plausibility check.
The function works on a data frame returned by this package's wrangling functions for age and for WFHZ data.
mw_plausibility_check_wfhz(df, sex, age, weight, height, flags)
df |
A |
sex |
A |
age |
A vector of class |
weight |
A vector of class |
height |
A vector of class |
flags |
A |
A single-row summary tibble with 19 columns containing the plausibility check results and their respective acceptability ratings.
SMART Initiative (2017). Standardized Monitoring and Assessment for Relief and Transition. Manual 2.0. Available at: https://smartmethodology.org.
mw_plausibility_check_mfaz()
mw_plausibility_check_muac()
mw_wrangle_age()
## First wrangle age data ----
data <- mw_wrangle_age(
  df = anthro.01,
  dos = dos,
  dob = dob,
  age = age,
  .decimals = 2
)

## Then wrangle WFHZ data ----
data_wfhz <- mw_wrangle_wfhz(
  df = data,
  sex = sex,
  weight = weight,
  height = height,
  .recode_sex = TRUE
)

## Now run the plausibility check ----
mw_plausibility_check_wfhz(
  df = data_wfhz,
  sex = sex,
  age = age,
  weight = weight,
  height = height,
  flags = flag_wfhz
)
Calculate the observed age ratio of children aged 24 to 59 months old over those aged 6 to 23 months old and test if there is a statistically significant difference between the observed and the expected.
mw_stattest_ageratio(age, .expectedP = 0.66)
age |
A |
.expectedP |
The expected proportion of children aged 24 to 59 months old over those aged 6 to 23 months old. By default, this is expected to be 0.66. |
This function should be used specifically when assessing the quality of MUAC data. For the age ratio test of children aged 6 to 29 months old over those aged 30 to 59 months old, as performed in the SMART plausibility check, use nipnTK::ageRatioTest() instead.
A list object with three elements: p for the p-value of the difference between the observed and the expected proportion of children aged 24 to 59 months old over those aged 6 to 23 months old, observedR for the observed ratio, and observedP for the observed proportion.
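The three quantities returned can be illustrated with base R. The sketch below is not the package's implementation: it assumes that the proportion is the share of children aged 24 to 59 months among all children aged 6 to 59 months, and it uses an exact binomial test (`binom.test()`) as a stand-in, since the test used internally is not stated in this section. The ages are made-up values.

```r
# Made-up ages in months (illustration only, not survey data)
age <- c(7, 12, 18, 23, 24, 30, 36, 41, 48, 59)

older   <- sum(age >= 24 & age < 60, na.rm = TRUE) # 24 to 59 months
younger <- sum(age >= 6 & age < 24, na.rm = TRUE)  # 6 to 23 months

observed_r <- older / younger           # ratio of older to younger children
observed_p <- older / (older + younger) # assumed: share of older children

# Assumption: exact binomial test against the expected proportion of 0.66;
# the package may rely on a different test.
p_value <- binom.test(older, older + younger, p = 0.66)$p.value

list(p = p_value, observedR = observed_r, observedP = observed_p)
```

A large p-value here would mean the observed age distribution does not differ detectably from the expected one.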
SMART Initiative. Updated MUAC data collection tool. Available at: https://smartmethodology.org/survey-planning-tools/updated-muac-tool/
mw_stattest_ageratio(
  age = anthro.02$age,
  .expectedP = 0.66
)
Wrangle a child's age for downstream analysis. This includes calculating age in months from the date of data collection and the child's date of birth, and setting to NA any age value that is less than 6.0 or greater than or equal to 60.0 months.
mw_wrangle_age(df, dos = NULL, dob = NULL, age, .decimals = 2)
df |
A |
dos |
A |
dob |
A |
age |
A |
.decimals |
The number of decimal places to round off age to. Default is 2. |
A tibble based on df. The variable age will be automatically filled in each row where the age value was missing and both the child's date of birth and the date of data collection are available. Age values less than 6.0 or greater than or equal to 60.0 months will be set to NA. Additionally, a new variable named age_days, of class double, holding the calculated age of the child in days is added to df.
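The age calculation described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the average month length of 365.25 / 12 days is an assumption, as the exact divisor is not given in this section, and the dates are made up.

```r
# Made-up survey and birth dates (illustration only)
surv_date  <- as.Date(c("2023-01-01", "2023-01-01", "2023-01-01"))
birth_date <- as.Date(c("2019-01-01", "2017-06-15", "2022-10-01"))

# Age in days between date of birth and date of data collection
age_days <- as.numeric(surv_date - birth_date)

# Assumption: an average month of 365.25 / 12 days
age_months <- round(age_days / (365.25 / 12), 2)

# Ages below 6 months, or at or above 60 months, are set to NA
age_months[age_months < 6 | age_months >= 60] <- NA
age_months
```

Only the first child falls within the 6 to 59 month range; the other two ages become NA.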
## Sample data ----
df <- data.frame(
  surv_date = as.Date(c(
    "2023-01-01", "2023-01-01", "2023-01-01", "2023-01-01", "2023-01-01"
  )),
  birth_date = as.Date(c(
    "2019-01-01", NA, "2018-03-20", "2019-11-05", "2021-04-25"
  )),
  age = c(NA, 36, NA, NA, NA)
)

## Apply the function ----
mw_wrangle_age(
  df = df,
  dos = surv_date,
  dob = birth_date,
  age = age,
  .decimals = 3
)
Calculate z-scores for MUAC-for-age (MFAZ) and identify outliers based on the SMART methodology. When age is not supplied, only outliers are detected from the raw MUAC values. The function only works after age has gone through mw_wrangle_age().
mw_wrangle_muac( df, sex, muac, age = NULL, .recode_sex = TRUE, .recode_muac = TRUE, .to = c("cm", "mm", "none"), .decimals = 3 )
df |
A |
sex |
A |
muac |
A |
age |
A |
.recode_sex |
Logical. Set to TRUE if the values for |
.recode_muac |
Logical. Set to TRUE if the values for raw MUAC should be converted to either centimeters or millimeters. Otherwise, set to FALSE. Note that the usage above shows TRUE as the default. |
.to |
A choice of the measuring unit to convert MUAC values into. Can be "cm" for centimeters, "mm" for millimeters, or "none" to leave as it is. |
.decimals |
The number of decimal places to use for z-score outputs. Default is 3. |
A tibble based on df. If age = NULL, a flag_muac variable for detected MUAC outliers based on raw MUAC is added to df. Otherwise, variables named mfaz for the child's MFAZ and flag_mfaz for detected outliers based on SMART guidelines are added to df.
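To illustrate what a flag on raw MUAC looks like when age is not available, the base R sketch below marks values outside a plausible range. The range of 100 to 200 mm is an assumption made for this example and should be checked against Bilukha & Kianian (2023) and the package source; the MUAC values are made up.

```r
# Made-up raw MUAC values in millimetres (illustration only)
muac_mm <- c(95, 110, 125, 138, 205, NA)

# Assumption: values below 100 mm or above 200 mm are treated as outliers.
# 1 = flagged, 0 = not flagged; missing values stay missing.
flag_muac <- ifelse(muac_mm < 100 | muac_mm > 200, 1, 0)
flag_muac
```

Keeping the flag as a separate 0/1 column, rather than deleting rows, leaves the decision to exclude records to the analysis step.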
Bilukha, O., & Kianian, B. (2023). Considerations for assessment of measurement quality of mid‐upper arm circumference data in anthropometric surveys and mass nutritional screenings conducted in humanitarian and refugee settings. Maternal & Child Nutrition, 19, e13478. https://doi.org/10.1111/mcn.13478
SMART Initiative (2017). Standardized Monitoring and Assessment for Relief and Transition. Manual 2.0. Available at: https://smartmethodology.org.
flag_outliers()
remove_flags()
mw_wrangle_age()
## When age is available, wrangle it first before calling the function ----
w <- mw_wrangle_age(
  df = anthro.02,
  dos = NULL,
  dob = NULL,
  age = age,
  .decimals = 2
)

### Then apply the function to wrangle MUAC data ----
mw_wrangle_muac(
  df = w,
  sex = sex,
  age = age,
  muac = muac,
  .recode_sex = TRUE,
  .recode_muac = TRUE,
  .to = "cm",
  .decimals = 3
)

## When age is not available ----
mw_wrangle_muac(
  df = anthro.02,
  sex = sex,
  age = NULL,
  muac = muac,
  .recode_sex = TRUE,
  .recode_muac = TRUE,
  .to = "cm",
  .decimals = 3
)
Calculate z-scores for weight-for-height (WFHZ) and identify outliers based on the SMART methodology.
mw_wrangle_wfhz(df, sex, weight, height, .recode_sex = TRUE, .decimals = 3)
df |
A |
sex |
A |
weight |
A vector of class |
height |
A vector of class |
.recode_sex |
Logical. Set to TRUE if the values for |
.decimals |
The number of decimal places to use for z-score outputs. Default is 3. |
A data frame based on df with new variables named wfhz for the child's WFHZ and flag_wfhz for detected outliers added.
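The flagging of z-scores can be sketched in base R. The example assumes SMART-style flags, where a z-score more than 3 units away from the observed sample mean is flagged; it is an illustration with made-up values and does not reproduce the package's internal code.

```r
# Made-up weight-for-height z-scores (illustration only)
wfhz <- c(-1.2, -0.8, -0.5, -0.3, 0.1, 0.4, -1.0, -0.6, 0.2, 4.9)

# Assumption: flag values more than 3 z-score units from the observed mean.
# 1 = flagged, 0 = not flagged.
observed_mean <- mean(wfhz, na.rm = TRUE)
flag_wfhz <- ifelse(abs(wfhz - observed_mean) > 3, 1, 0)

data.frame(wfhz = wfhz, flag_wfhz = flag_wfhz)
```

Centring on the observed mean rather than on zero means the flags adapt to the survey population instead of the reference population.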
SMART Initiative (2017). Standardized Monitoring and Assessment for Relief and Transition. Manual 2.0. Available at: https://smartmethodology.org.
flag_outliers()
remove_flags()
mw_wrangle_wfhz(
  df = anthro.01,
  sex = sex,
  weight = weight,
  height = height,
  .recode_sex = TRUE,
  .decimals = 2
)
Convert MUAC values to either centimeters or millimeters.
recode_muac(x, .to = c("cm", "mm"))
x |
A vector of raw MUAC values. The class can either be |
.to |
Either "cm" (centimeters) or "mm" (millimeters) for the unit of measurement to convert MUAC values to. |
A numeric vector of the same length as x with values set to the specified unit of measurement.
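The conversion itself is a factor of 10 in either direction. The sketch below uses a hypothetical helper, `convert_muac()`, written for illustration only; it is not the package's implementation and, unlike a production function, it assumes the input is already known to be in the other unit.

```r
# Hypothetical helper (not part of the package): convert MUAC between
# millimetres and centimetres, assuming the input is in the other unit.
convert_muac <- function(x, to = c("cm", "mm")) {
  to <- match.arg(to)
  switch(to,
    cm = x / 10, # millimetres to centimetres
    mm = x * 10  # centimetres to millimetres
  )
}

muac_mm <- c(115, 125, 140) # made-up values in millimetres
muac_cm <- convert_muac(muac_mm, to = "cm")
muac_back <- convert_muac(muac_cm, to = "mm")
```

Converting to centimetres and back returns the original values, which is a quick sanity check on the unit handling.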
## Recode from millimeters to centimeters ----
muac_cm <- recode_muac(
  x = anthro.01$muac,
  .to = "cm"
)
head(muac_cm)

## Using the `muac_cm` object to recode it back to "mm" ----
muac_mm <- recode_muac(
  x = muac_cm,
  .to = "mm"
)
tail(muac_mm)
Sample SMART survey data with the weight-for-height z-score standard deviation rated as problematic.
wfhz.01
A tibble with 303 rows and 6 columns.
Variable | Description |
cluster | Primary sampling unit |
sex | Sex; "m" = boys, "f" = girls |
age | Calculated age in months with two decimal places |
edema | Edema; "n" = no edema, "y" = with edema |
wfhz | Weight-for-height z-scores with 3 decimal places |
flag_wfhz | Flagged weight-for-height z-score value; 1 = flagged, 0 = not flagged |
Anonymous
wfhz.01