Effect of compositional data in the multivariate analysis of sterol concentrations in river sediments
Abstract
In this paper, multivariate analysis of sterol concentrations detected in river sediment samples was performed. To remove the co-dependence of values, sterol concentrations were transformed using the centered log-ratio (CLR) transformation. The main objective of the work was to point out the damaging effects of working in the wrong geometry on the principal component analysis (PCA) assessment of sterol pollution. To determine whether the lost dimension affects the PCA of sterols in sediments, we performed the PCA using both raw and log-ratio transformed sterol data. Additionally, two rounded-zero replacement approaches, i.e. a simple-substitution method (DL/2, 0.55 DL and DL/√2) and a multiplicative replacement strategy (0.65 DL), were compared to determine whether the replacement values affect the PCA results and conclusions. Relevant differences were noted when comparing the PCA results obtained with raw data and with log-ratio transformed sterol data. Only the PC loadings obtained from the CLR PCA allowed a clear distinction between human-sourced pollution and biogenic sources of sterols, whereas in the PCA with raw data the loadings were almost all grouped in a single quadrant. For a small proportion of rounded zeros (not more than 10%), the two replacement approaches did not have any effect on the transformed PCA output. The results presented in this work show that the effect of "closure" in the sterol data can be easily observed from the PCA biplot, and that it obstructs the evaluation of the human contribution to pollution of river sediments. Therefore, prior to the PCA, sterol concentrations must be CLR transformed in order to perform a reliable assessment of sewage contamination.
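The two preprocessing steps named in the abstract, rounded-zero replacement and the CLR transformation, can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' workflow: DL is taken to mean the detection limit, the function names, sample values and detection limits are hypothetical, and the multiplicative rule shown is the standard form that shrinks the non-zero parts so the row total is preserved.

```python
import math

def simple_substitution(row, dl, factor=0.5):
    """Replace each zero with factor * DL; non-zero parts stay unchanged."""
    return [factor * d if x == 0 else x for x, d in zip(row, dl)]

def multiplicative_replacement(row, dl, factor=0.65):
    """Replace zeros with factor * DL and shrink the non-zero parts
    so that the row total is the same as before replacement."""
    total = sum(row)
    delta = [factor * d if x == 0 else 0.0 for x, d in zip(row, dl)]
    shrink = 1.0 - sum(delta) / total
    return [dj if x == 0 else x * shrink for x, dj in zip(row, delta)]

def clr(row):
    """Centered log-ratio: log of each part minus the mean of the logs
    (equivalently, log of each part over the geometric mean)."""
    logs = [math.log(x) for x in row]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical sterol concentrations for one sample; 0.0 marks a rounded zero.
sample = [120.0, 35.0, 0.0, 410.0]
detection_limits = [1.0, 1.0, 2.0, 1.0]  # assumed values, same units as sample

substituted = simple_substitution(sample, detection_limits)
filled = multiplicative_replacement(sample, detection_limits)

clr_sample = clr(filled)
# Rescaling the whole sample leaves the CLR values unchanged.
clr_scaled = clr([10.0 * x for x in filled])
```

Each CLR-transformed row sums to zero and is unaffected by rescaling the sample, which is why the transform is insensitive to the closure of the data. A PCA would then be run on the matrix of CLR rows rather than on the raw concentrations.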
Keywords:
Centered log-ratio transformation / Principal component analysis / Fecal contamination / Sterols / River sediments
Source:
Microchemical Journal, 2018, 139, 188-195
Publisher:
Elsevier Science Bv, Amsterdam
Funding / projects:
DOI: 10.1016/j.microc.2018.02.031
ISSN: 0026-265X
WoS: 000433268700023
Scopus: 2-s2.0-85042638972
Institution/Community
Tehnološko-metalurški fakultet
TY  - JOUR
AU  - Antanasijević, Davor
AU  - Matić-Bujagić, Ivana
AU  - Grujić, Svetlana
AU  - Laušević, Mila
PY  - 2018
UR  - http://TechnoRep.tmf.bg.ac.rs/handle/123456789/4020
PB  - Elsevier Science Bv, Amsterdam
T2  - Microchemical Journal
T1  - Effect of compositional data in the multivariate analysis of sterol concentrations in river sediments
EP  - 195
SP  - 188
VL  - 139
DO  - 10.1016/j.microc.2018.02.031
ER  -
@article{
author = "Antanasijević, Davor and Matić-Bujagić, Ivana and Grujić, Svetlana and Laušević, Mila",
year = "2018",
publisher = "Elsevier Science Bv, Amsterdam",
journal = "Microchemical Journal",
title = "Effect of compositional data in the multivariate analysis of sterol concentrations in river sediments",
pages = "188-195",
volume = "139",
doi = "10.1016/j.microc.2018.02.031"
}
Antanasijević, D., Matić-Bujagić, I., Grujić, S., & Laušević, M. (2018). Effect of compositional data in the multivariate analysis of sterol concentrations in river sediments. in Microchemical Journal. Elsevier Science Bv, Amsterdam, 139, 188-195. https://doi.org/10.1016/j.microc.2018.02.031
Antanasijević D, Matić-Bujagić I, Grujić S, Laušević M. Effect of compositional data in the multivariate analysis of sterol concentrations in river sediments. in Microchemical Journal. 2018;139:188-195. doi:10.1016/j.microc.2018.02.031
Antanasijević, Davor, Matić-Bujagić, Ivana, Grujić, Svetlana, Laušević, Mila, "Effect of compositional data in the multivariate analysis of sterol concentrations in river sediments" in Microchemical Journal, 139 (2018):188-195, https://doi.org/10.1016/j.microc.2018.02.031