Lycopene is a vegetable and fruit antioxidant that is thought to have wide-ranging health benefits. This study used public use National Health and Nutrition Examination Survey (NHANES III) data to investigate whether high plasma lycopene concentration was associated with reduced mortality in adults.
Patients and Methods
NHANES III complex probabilistic household adult, laboratory and mortality data were merged. Specialized survey analysis software was used. Plasma lycopene concentration was analyzed with potential demographic, socioeconomic and health status confounders.
The significant univariables were: age, plasma lycopene concentration (LYPSI), poverty income ratio (DMPPIR), and drinking hard liquor. In multivariate analysis limited to sample persons 45 years or older, the predictors that remained significant after adjustment were: age, LYPSI, Mexican American race and DMPPIR.
Serum lycopene was associated with 88% risk reduction in all cause mortality in adults 45 years or older. Including lycopene rich products in diets may be beneficial for middle and older aged adults.
Keywords: NHANES III; Serum Lycopene; All Cause Mortality; adults; Older Than 45 years
Lycopene is an important non-toxic red pigment that has been used in food coloring. Its chemical properties and methods of extraction are under active study. Lycopene is a polyunsaturated fatty molecule that occurs in a large variety of fruits and vegetables. In Western diets, tomato is a major source of lycopene. In Southeast Asia, GAC, a main staple fruit, has 50 times the lycopene content of tomato per gram of wet fruit (http://en.wikipedia.org/wiki/Lycopene#cite_note-14). In epidemiologic studies, lycopene has been found to have various health benefits because of its antioxidant effects. Lycopene has been found to decrease the risk of prostate cancer and the risk of uterine cancer. Lycopene is associated with bone turnover biomarkers and improves bone health by decreasing the body's oxidative stress. A prospective trial has shown that lycopene increased other serum antioxidative mediators. Potential health benefits of a high serum lycopene level have been suggested, and a Japanese trial has been conducted to increase the serum lycopene concentration [8]. Tomato juice is also part of the Mediterranean diet, a trial of which was stopped early because of a significant reduction in the risk of developing cardiovascular disease. This study, part of a series screening for potential chemicals with beneficial health effects, took advantage of the large public use NHANES III (National Health and Nutrition Examination Survey) data to investigate whether high serum lycopene concentration decreased the mortality of US adults.
Materials and Methods
NHANES and NHANES III
NHANES is a major program of the National Center for Health Statistics (a part of the Centers for Disease Control and Prevention (CDC) of the United States of America) that started in 1971. NHANES III is a national study based on a complex, multi-stage probability sampling design. For details of NHANES data, statistical guidance and analysis examples, see the NHANES website (http://www.cdc.gov/nchs/nhanes.htm). NHANES studies were approved by CDC internal institutional review boards. The public use data are made available to the public and researchers. The NHANES sample weights were calculated to represent the non-institutionalized general US population and to account for non-coverage and non-response. The sample persons were interviewed at home and examined in mobile examination centers (MEC). This eliminated the confounding effects of sample persons being too frail, too young or too old to go to the MEC for examinations. In this study, the NHANES III (conducted between 1988 and 1994) household adult data file was merged with the NHANES III laboratory data and the NHANES III linked cancer mortality data.
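The merge of the household adult, laboratory and linked mortality files can be pictured as a join on a shared respondent identifier. The following is a minimal Python sketch, not the STATA code used in the study. It assumes each file has already been read into a list of records keyed by the SEQN identifier that NHANES III provides; the function name and all record values are hypothetical.

```python
def merge_on_seqn(*files):
    """Inner-join several record lists on the SEQN respondent identifier."""
    merged = {}
    for i, records in enumerate(files):
        by_seqn = {rec["SEQN"]: rec for rec in records}
        if i == 0:
            merged = {seqn: dict(rec) for seqn, rec in by_seqn.items()}
        else:
            # Keep only respondents present in every file seen so far.
            merged = {
                seqn: {**row, **by_seqn[seqn]}
                for seqn, row in merged.items()
                if seqn in by_seqn
            }
    return sorted(merged.values(), key=lambda rec: rec["SEQN"])


# Made-up records for illustration only (not NHANES data).
adult = [{"SEQN": 3, "HSSEX": 1}, {"SEQN": 4, "HSSEX": 2}]
laboratory = [{"SEQN": 3, "LYPSI": 0.41}, {"SEQN": 4, "LYPSI": 0.22}]
mortality = [{"SEQN": 3, "IndicatorDeath": 0}]

linked = merge_on_seqn(adult, laboratory, mortality)
```

An inner join is used here so that a respondent missing from any one file drops out, which matches a complete-case analysis; a different join would be needed to keep partially observed respondents.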
NHANES III linked mortality data
NHANES III participants were followed passively until December 31, 2006 for their mortality data. Detailed information about the data and analysis guidelines is available at the website (http://www.cdc.gov/nchs/data_access/data_linkage/mortality/nhanes3_linkage.htm). In brief, probabilistic matching was used to link NHANES III with the National Death Index for vital status and mortality; follow up was censored at age 90 years because older sample persons contribute little in person years. NHANES used multiple sources, including death certificates and the National Death Index, to ascertain vital status and cause of death (UCOD_113).
NHANES III employed a complex sampling strategy and analysis [10-13]. Matlab programs (posted on Matlab File Exchange) were developed to convert the SAS files provided by NHANES to STATA programs to download the NHANES III data files for further analysis. Specialized survey software is needed for NHANES complex data analysis. STATA 12 (College Station, TX) was among those recommended by CDC to analyze the complex NHANES data and was used in this study. The sampling weight used was WTPFEX6 because only the sample persons who had examinations in the MEC were included in this study; SDPPSU6 was used for the primary sampling unit (PSU) and SDPSTRA6 was used to designate the strata for the STATA survey commands. STATA scripts were written for this analysis and will be submitted for publication separately. Univariate and multivariate logistic regressions were used to study the relationship between serum lycopene concentration in S.I. units (LYPSI) and all cause mortality in adults (17 years or older). The status of mortality was coded as a binary outcome (1 = death, 0 = otherwise). Linearized Taylor Standard Error estimation was used. The covariates and the corresponding NHANES III codes used were: LYPSI (umol/L), MXPAXTMR (age at the MEC final examination in months), HSSEX (sex, _IHSSEX_1 = male, female as the reference group when applicable), HAM6S (weight in lbs without clothes), DMPMETRO (urban rural residence status), _IDMPMETRO_2 (rural residence, urban residence was used as the reference group), DMARETHN (race and ethnicity, _IDMARETHN_2 = non-Hispanic black, _IDMARETHN_3 = Mexican Americans, _IDMARETHN_4 = others, non-Hispanic white was used as the reference group), DMPPIR (poverty income ratio), HAN6JS (alcohol consumption, number of hard liquor drinks per month), and HAR4S (smoking, number of cigarettes per day).
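To make the model concrete, the sketch below fits a sample-weighted logistic regression of a binary death indicator on a single covariate and reports the odds ratio as the exponential of the slope. It is an illustration in plain Python under simplifying assumptions: it reproduces only the weighted point estimate, it ignores the strata and PSU information that the STATA survey commands use for the linearized standard errors, and the function name and data are hypothetical.

```python
import math


def weighted_logit_odds_ratio(x, y, w, iterations=50):
    """Fit logit P(y=1) = b0 + b1*x by weighted Newton-Raphson; return exp(b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        # Gradient (g0, g1) and information matrix (h00, h01, h11) of the
        # weighted log-likelihood at the current coefficients.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, wi in zip(x, y, w):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += wi * (yi - p)
            g1 += wi * (yi - p) * xi
            v = wi * p * (1.0 - p)
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        # Newton step: solve the 2x2 system (information matrix) * step = gradient.
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < 1e-12:
            break
    return math.exp(b1)


# Made-up data: a hypothetical high-lycopene indicator, death indicator, weights.
high_lycopene = [0, 0, 1, 1]
died = [0, 1, 0, 1]
weights = [3.0, 1.0, 1.0, 2.0]

odds_ratio = weighted_logit_odds_ratio(high_lycopene, died, weights)
```

With a binary covariate the fitted odds ratio equals the cross-product ratio of the weighted 2x2 table, here (2 x 3) / (1 x 1) = 6, which gives a simple check on the fit. A continuous covariate such as LYPSI is handled by the same code, with the odds ratio then read per unit of the covariate.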
For STATA analyses, only the sample persons without missing values for all of WTPFEX6, SDPPSU6, SDPSTRA6, LYPSI, MXPAXTMR, HSSEX, DMPMETRO, HAM6S, DMARETHN, DMPPIR, HAR4S, and HAN6JS were included in this study. Further, records with these additional NHANES III codes were considered not eligible: LYPSI (8888), HAM6S (888), HAM6S (999), DMPPIR (888888), HAR4S (666), HAR4S (777), HAR4S (888), HAR4S (999), HAN6JS (888), and HAN6JS (999). Also not eligible were sample persons not in the range BMI > 15 & BMI < 50, youth sample persons, and those with incomplete mortality data. The numerator of DMPPIR was the midpoint of the observed family income category in the Family Questionnaire variable HFF19R, and the denominator was the poverty threshold, which takes into account the age of the family reference person and the calendar year in which the family was interviewed. The analysis was performed with MXPAXTMR (age) as a continuous variable including all ages above 17 years old, and was also limited to those above 40 years old (during exploratory studies, where the p-value of LYPSI was marginally significant) and above 45 years old. A total of 1272 sample persons were eligible for this study.
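The eligibility rules above can be sketched as a single record filter. This Python illustration makes several assumptions: records are dictionaries keyed by the NHANES III variable names, missing values are stored as None, BMI is precomputed under a hypothetical "BMI" key, and "45 years or older" is read as an MXPAXTMR of at least 540 months. The exclusions of youth sample persons and of records with incomplete mortality data are not modelled.

```python
REQUIRED = ["WTPFEX6", "SDPPSU6", "SDPSTRA6", "LYPSI", "MXPAXTMR", "HSSEX",
            "DMPMETRO", "HAM6S", "DMARETHN", "DMPPIR", "HAR4S", "HAN6JS"]

# Codes listed in the text as not eligible, per variable.
EXCLUDED_CODES = {
    "LYPSI": {8888},
    "HAM6S": {888, 999},
    "DMPPIR": {888888},
    "HAR4S": {666, 777, 888, 999},
    "HAN6JS": {888, 999},
}


def is_eligible(rec, min_age_years=45):
    """Return True if a record passes the exclusion rules sketched here."""
    if any(rec.get(name) is None for name in REQUIRED):
        return False
    for name, codes in EXCLUDED_CODES.items():
        if rec[name] in codes:
            return False
    bmi = rec.get("BMI")  # hypothetical key; BMI assumed to be precomputed
    if bmi is None or not (15 < bmi < 50):
        return False
    # MXPAXTMR is age in months; the age cut-off is an assumed reading.
    return rec["MXPAXTMR"] >= min_age_years * 12


# Made-up record for illustration only (not NHANES data).
example = {"WTPFEX6": 1500.0, "SDPPSU6": 1, "SDPSTRA6": 12, "LYPSI": 0.38,
           "MXPAXTMR": 660, "HSSEX": 1, "DMPMETRO": 1, "HAM6S": 170,
           "DMARETHN": 1, "DMPPIR": 3.1, "HAR4S": 20, "HAN6JS": 4, "BMI": 25.7}

eligible = is_eligible(example)
```

Keeping the excluded codes in one table makes it easy to compare the filter against the variable documentation and to rerun the analysis with a different age cut-off, for example min_age_years=40 for the exploratory analysis.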
There were 20024 cases in the NHANES III linked mortality data file included in this study. 13944 cases were not available in the public use file to protect the privacy of youth subjects. 26 cases in the NHANES III linked dataset did not have mortality data. All cause mortality (5291 deaths out of 33994 subjects) was used as the binary outcome for this analysis. The NHANES III adult data file and the NHANES III linked mortality file were merged according to the SEQN number provided by NHANES III to uniquely identify the cases. All the results were obtained by using survey commands that take into account the primary sampling unit, the stratification variables and the weights assigned to the sample persons examined in the MEC. Thus these results are representative of the US population. Table 1 shows the demographic, socioeconomic and other univariables used in this study. The mean risk of death (S.E.) was 0.39 (0.35 - 0.43); the mean follow up in months from the MEC examination (S.E.) was 150.085 months (144.77 - 155.40); the mean body mass index (kg/m^2) (S.E.) was 25.68 (25.28 - 26.08); the mean lycopene concentration (umol/L) was 0.38 (0.36 - 0.40); the mean poverty income ratio (S.E.) was 3.11 (2.90 - 3.32); the mean number of hard liquor drinks per month (S.E.) was 3.60 (2.79 - 4.41); and the mean number of cigarettes smoked was 21.80 (20.56 - 23.04).
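How the weights, strata and PSUs enter a survey estimate can be illustrated with a weighted mean and its Taylor linearized standard error. The sketch below is plain Python under stated assumptions: PSUs are treated as sampled with replacement within strata, every stratum has at least two PSUs, and no finite population correction is applied. It is not the STATA implementation, and the function name and data are hypothetical.

```python
import math
from collections import defaultdict


def survey_mean(y, w, strata, psu):
    """Weighted mean and Taylor linearized SE for a stratified, clustered sample."""
    total_w = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w

    # Linearized score of the ratio estimator, summed within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for yi, wi, h, j in zip(y, w, strata, psu):
        psu_totals[h][j] += wi * (yi - mean) / total_w

    # Between-PSU variance of the score totals, accumulated over strata.
    variance = 0.0
    for totals in psu_totals.values():
        n_h = len(totals)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        z_bar = sum(totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in totals.values())
    return mean, math.sqrt(variance)


# Made-up data: death indicator, weights, stratum and PSU labels.
died = [1, 0, 0, 1, 1, 0]
weights = [2.0, 1.0, 1.0, 2.0, 1.0, 1.0]
stratum = [1, 1, 1, 2, 2, 2]
cluster = [1, 1, 2, 1, 2, 2]

mean_death, se_death = survey_mean(died, weights, stratum, cluster)
```

When every observation is its own PSU in a single stratum and all weights are equal, this reduces to the familiar sample standard deviation divided by the square root of n, which is a convenient sanity check.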
Table 1. Demographic, socioeconomic and health status covariables in adults 45 years or older. IndicatorDeath: 0=alive, 1=dead. Linearized Taylor Standard Error estimation was used. The NHANES III codes used were: body mass index, MXPAXTMR (age at the MEC final examination), HSSEX (sex), LYPSI (serum lycopene concentration in S.I. units), DMPMETRO (urban rural residence status), HAM6S (weight in lbs without clothes), DMARETHN (race and ethnicity), DMPPIR (poverty income ratio), HAN6JS (alcohol consumption), HAR4S (smoking), and permth_exm (months of follow up from MEC examination). n = 1272 samples.
For all cause mortality, the significant univariables, with odds ratios (95% confidence interval), were: age, 1.011 (1.0088 - 1.013); plasma lycopene concentration (LYPSI), 0.13 (0.047 - 0.36); poverty income ratio (DMPPIR), 0.81 (0.75 - 0.87); and drinking hard liquor, 1.024 (1.011 - 1.037). In multivariate analyses using age as a continuous variable for 17 years and older, LYPSI did not reach statistical significance. In multivariate analysis limited to sample persons 45 years or older, the predictors that remained significant after adjustment for the covariates, with odds ratios (95% confidence interval), were: age, 1.011 (1.0086 - 1.013); LYPSI, 0.33 (0.12 - 0.91); Mexican Americans (using non-Hispanic whites as the reference group), 0.61 (0.41 - 0.91); and DMPPIR, 0.85 (0.78 - 0.92).
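The odds ratios above can be put on a more intuitive scale with two small conversions, sketched here in Python. The sketch assumes each odds ratio is a multiplicative change in the odds of death per one unit of the covariate (per umol/L for LYPSI, per month for age); the function names are hypothetical, and a change in odds is not the same quantity as a change in risk.

```python
def percent_change_in_odds(odds_ratio):
    """Percent change in the odds of death per one-unit increase in the covariate."""
    return (odds_ratio - 1.0) * 100.0


def rescale_odds_ratio(odds_ratio, units):
    """Odds ratio for a change of `units` in the covariate instead of one unit."""
    return odds_ratio ** units


# Odds ratios quoted in the text.
univariate_lypsi = percent_change_in_odds(0.13)   # about -87 percent
adjusted_lypsi = percent_change_in_odds(0.33)     # about -67 percent
age_per_year = rescale_odds_ratio(1.011, 12)      # per-month ratio over 12 months
```

On this reading, the univariate odds ratio of 0.13 corresponds to about 87% lower odds per umol/L of lycopene and the adjusted odds ratio of 0.33 to about 67% lower odds, while the age odds ratio of 1.011 per month compounds to roughly 1.14 per year.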
Lycopene is a red carotene and carotenoid pigment that occurs in large amounts in GAC fruit, a main staple in Southeast Asian diets. It is a non-toxic vegetable and fruit antioxidant that has been used in food coloring. A major source of lycopene in the Western diet is tomato and tomato products. It is thought to have ubiquitous health benefits. Lycopene has been used in chemoprevention, such as prostate cancer [17-21] and bladder cancer chemoprevention. There are active studies investigating the role of lycopene in carcinogenesis and nutrigenetics. However, the Food and Drug Administration has a highly limited and qualified claim for the benefits of eating tomato and tomato products (http://en.wikipedia.org/wiki/Lycopene). Against this background, this study used public use National Health and Nutrition Examination Survey (NHANES III) data to investigate the association between plasma lycopene concentration and mortality in adults.
The results presented here were analyzed taking into account the complex probabilistic sampling. Thus these results are representative of the US non-institutionalized population as designed by NHANES. There were 1272 sample persons who had complete data and were used in this analysis. In analyses using age as a continuous variable and including all of the sample persons older than 17 years old, serum lycopene concentration was not a significant predictor of all cause mortality. When the age group was limited to those sample persons older than 40 years old, the odds ratio was 0.31, with a p-value of 0.057 and a 95% confidence interval of 0.092 to 1.035. When the age was limited to above 45 years old, the serum lycopene concentration became an independent predictor of all cause mortality. Only the analysis results when the age was limited to older than 45 years old are presented here. The mean risk of death for these sample persons older than 45 years old was 0.39 after a mean follow up of 12.5 years (Table 1). These subjects had a marginally normal mean body mass index (25.68, Table 1). Their mean lycopene concentration (umol/L) was 0.38. The average sample person had an income about 3 times the poverty level (Table 1). They drank 3.60 glasses of hard liquor per month and smoked 21.80 cigarettes per day.
For all cause mortality, the significant univariables (Table 2) for this group (45 years or older) were age, plasma lycopene concentration, poverty income ratio and drinking hard liquor. In multivariate analysis (Table 3), age, serum lycopene concentration, Mexican American race (relative to non-Hispanic whites), and poverty income ratio remained significant after adjustment. The effects of racial disparities and the adverse effects of smoking and drinking on mortality have been reported and are supported by this study.
Table 2. Univariate analysis of lycopene and covariates of all cause mortality in adults 45 years or older. IndicatorDeath: 0=alive, 1=dead. Linearized Taylor Standard Error estimation was used. The NHANES III codes used were: body mass index, MXPAXTMR (age at the MEC final examination), HSSEX (sex), LYPSI (serum lycopene concentration in S.I. units), DMPMETRO (urban rural residence status), HAM6S (weight in lbs without clothes), DMARETHN (race and ethnicity), DMPPIR (poverty income ratio), HAN6JS (alcohol consumption), and HAR4S (smoking). n = 1272 samples.
Although lycopene has been found to be beneficial to specific patient groups, the benefit of high plasma lycopene concentration in reducing the risk of all cause mortality in adults has not been well documented. This study showed that high serum lycopene concentration was associated with an 88% risk reduction in all cause mortality in adults 45 years or older, and that there was no benefit of a higher plasma lycopene concentration in reducing all cause mortality for younger persons.
This study supports including lycopene rich products such as tomato in diets for middle aged and older adults.
Table 3. Multivariate analysis of serum lycopene concentration and covariates as predictors of all cause mortality in adults 45 years or older. IndicatorDeath: 0=alive, 1=dead. Linearized Taylor Standard Error estimation was used. The NHANES III codes used were: BMI (body mass index), HSSEX (_IHSSEX_2 = female, using male as the reference group), LYPSI (serum lycopene concentration in S.I. units), MXPAXTMR (age at the MEC final examination), HAM6S (weight in lbs without clothes), DMPMETRO (urban rural residence status, _IDMPMETRO_2 = rural residence, urban residence used as the reference group), DMARETHN (race and ethnicity, _IDMARETHN_2 = non-Hispanic black, _IDMARETHN_3 = Mexican Americans, _IDMARETHN_4 = others, non-Hispanic white used as the reference group), DMPPIR (poverty income ratio), HAN6JS (alcohol consumption), and HAR4S (smoking). n = 1272 samples.