 Lee SA, Shu XO, Li H, Yang G, Cai H, Wen W, Ji BT, Gao J, Gao YT, Zheng W. Adolescent and adult soy food intake and breast cancer risk: results from the Shanghai Women's Health Study. Am J Clin Nutr. 2009 Jun;89(6):1920-6.

PubMed ID: 19403632
Study Design:
Prospective Cohort Study
Class: B
POSITIVE: See Research Design and Implementation Criteria Checklist below.
Research Purpose:

To investigate the association of soy food intake with breast cancer risk using data from a prospective cohort study with a focus on evaluating the joint effect of soy food intake in adolescents and adults.

Inclusion Criteria:
  • Women who resided in seven urban communities of Shanghai
  • Ages between 40 and 70 years

Note: The complete inclusion criteria were published elsewhere.

Exclusion Criteria:

Excluded if not included above

For this study, data were excluded for women with:

  • a history of breast cancer
  • reported energy intake <500 or >3500 kcal/day
  • loss to follow-up shortly after recruitment
Description of Study Protocol:

Recruitment: The women were recruited from seven urban communities in Shanghai between 1996 and 2000 and participated in the Shanghai Women's Health Study, with follow-up until 2007.


Design: Prospective cohort study


Blinding used (if applicable): not applicable


Intervention (if applicable): not applicable


Statistical Analysis:

  • The lowest quintile served as the reference group
  • A Cox proportional hazards regression model was used to estimate the relative risks and CIs associated with soy intake. Age was used as the time scale, and the model was stratified by birth cohort
  • Linear trend was evaluated by modeling categorical intake variables as ordinal variables in the model (in 5-year intervals). Covariates included in the model were education, physical activity, age at first live birth, body mass index, season of recruitment, family history of breast cancer, and total energy intake
  • Analyses were stratified by menopausal status to evaluate any modifying effect of menopause on the association with soy intake. Menopausal status was updated during the follow-up surveys and treated as a time-varying variable in the analyses
  • When studying the combined effect of soy intake during adolescence and adulthood, soy food consumption was measured by soy protein intake and categorized by tertile distribution
  • A likelihood ratio test was used to evaluate potential multiplicative interactions between two study variables by comparing the models with and without the cross-product terms of these variables
  • All tests were two-sided
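
The likelihood ratio test for interaction described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the log-likelihood values and the count of four cross-product terms are hypothetical placeholders, and `chi2_sf` and `likelihood_ratio_test` are names introduced here. The statistic is twice the difference in log-likelihood between the model with the cross-product terms and the model without them, referred to a chi-square distribution whose degrees of freedom equal the number of added terms.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting point: df = 1 (odd case) or df = 2 (even case).
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1),
    # applied until a reaches df / 2.
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced, loglik_full, n_added_terms):
    """Return (statistic, p-value) for nested models.

    loglik_reduced: log-likelihood of the model without cross-product terms
    loglik_full:    log-likelihood of the model with cross-product terms
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, n_added_terms)


# Hypothetical log-likelihoods, for illustration only (not study values).
stat, p_value = likelihood_ratio_test(-5210.4, -5207.9, n_added_terms=4)
print(f"LR statistic = {stat:.2f}, P = {p_value:.3f}")
```

A small P value would suggest that the cross-product terms improve the fit, that is, that the two variables interact on the multiplicative scale.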


Data Collection Summary:

Timing of Measurements: A food frequency questionnaire (FFQ) was administered at the baseline survey and again approximately two to three years later, during the first follow-up survey. Sociodemographic factors, diet and lifestyle habits, menstrual and reproductive history, hormone use, and medical history, as well as anthropometric measurements, were collected at the baseline survey. Biennial in-person follow-up surveys were carried out during 2000-2002, 2002-2004, and 2004-2007. Cancer and death certificate registries were followed annually.


Dependent Variables

  •  Breast cancer

Independent Variables

  •  Soy food intake: isoflavones and soy protein
    • defined in two ways: 1) intake assessed at baseline; 2) cumulative average intake derived from 2 FFQs administered at baseline and the first follow-up survey
      • A validated quantitative FFQ was used to assess usual dietary intake of soy. 
      • The FFQ covered virtually all soy foods consumed in urban Shanghai, including soy milk, tofu, soy products other than tofu, dried soybeans, soybean sprouts, and fresh soybeans.
      • Energy and nutrient intakes, including soy protein and isoflavone intakes, were calculated.
      • Dietary intake during adolescence (between ages of 13 and 15 years) was assessed by using a brief FFQ, including 19 raw food items or groups.
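
The two exposure definitions above (baseline intake and cumulative average intake from the two FFQs) can be illustrated with a short sketch. It assumes, hypothetically, that intake is stored in g/day of soy protein and that a missing follow-up FFQ is represented as `None`. Falling back to the baseline value in that case is an assumption made here, not a rule stated in this summary.

```python
from typing import Optional


def cumulative_average_intake(baseline: float, follow_up: Optional[float]) -> float:
    """Average of the baseline and first follow-up FFQ intakes."""
    if follow_up is None:
        # Assumption: with no second FFQ, use the baseline value alone.
        return baseline
    return (baseline + follow_up) / 2.0


# Hypothetical intakes in g/day of soy protein, for illustration only.
print(cumulative_average_intake(8.0, 12.0))
print(cumulative_average_intake(8.0, None))
```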

Control Variables

  • Age at menarche, menopause and first live birth
  • Nulliparous
  • Positive breast cancer family history
  • BMI
  • Waist-to-hip ratio
  • Total energy intake


Description of Actual Data Sample:


Initial N: 74,942

Attrition (final N): 73,225

Reasons for exclusion: women with extreme total energy intake (<500 or >3500 kcal/day) (n=124); women lost to follow-up shortly after recruitment (n=8); history of cancer (n=1576)
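
As a quick arithmetic check of the figures above, the itemized exclusions can be subtracted from the initial sample. The sketch uses only the counts reported in this summary; the variable names are introduced here for illustration.

```python
initial_n = 74_942
reported_final_n = 73_225

# Exclusion counts as itemized in this summary.
exclusions = {
    "extreme total energy intake": 124,
    "lost to follow-up shortly after recruitment": 8,
    "history of cancer": 1_576,
}

itemized_total = sum(exclusions.values())
expected_final_n = initial_n - itemized_total
unexplained = expected_final_n - reported_final_n

print(itemized_total, expected_final_n, unexplained)
```

The itemized exclusions sum to 1,708, which leaves 73,234 women, 9 more than the reported final N of 73,225. This summary does not say how the remaining 9 were excluded.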

Age: mean age was 52 years

Ethnicity: Asian

Other relevant demographics:  

  • Significant differences between cases and noncases were found for
    • education (more cases went to college and fewer had less than an elementary school education),
    • usual occupation (more cases were in professional/technical positions versus more noncases in manufacturing/construction)
    • income (cases tended to have higher income)
    • age at menarche (cases versus noncases: 14.8±1.81 years versus 14.9±1.74 years, P<0.01)
    • menopause (cases versus noncases: 49.3±4.52 versus 48.6±.34 years, P<0.001)
    • first live birth (cases versus noncases: 26.4±4.08 years versus 25.6±4.13 years, P<0.001),
    • nulliparous (4.2% versus 3.3%, N.S.)
    • positive breast cancer family history (3.5% versus 1.8%, P<0.01).
  • Very few women were regular alcohol drinkers (1.9%), cigarette smokers (2.4%), or hormone replacement therapy users (3.9%).


Cases versus noncases: 

  • BMI: 24.3±3.40 versus 24.0±3.42 kg/m2, P=0.02
  • Waist-to-hip ratio: 0.81±0.05 versus 0.81±0.05, P=0.09

Participants in the higher quintile (>12.82 soy food intake) were more likely to have a higher BMI (24.5±3.47 kg/m2 versus 23.6±3.46 kg/m2) and waist-to-hip ratio (0.815±0.054 versus 0.809±0.054) than women with lower soy intake.

Location: Shanghai, China


Summary of Results:

Key Findings

  • There were 594 incident cases of breast cancer over a mean follow-up of 7.4 years. Mean age at diagnosis was 52.1±9.06 years.
  • Adult soy food consumption, measured as either soy protein (RR=0.41; 95% CI: 0.25, 0.70) or isoflavones (RR=0.44; 95% CI: 0.26, 0.73), was inversely associated with the risk of premenopausal breast cancer (P<0.001) when the highest intake quintile was compared with the lowest
  • A 43% reduced risk (95% CI: 0.34, 0.97) of premenopausal breast cancer was found among those whose soy food intake during adolescence was in the highest intake group, although the test of linear trend was only of borderline significance (P=0.061)
  • Women who consumed a high amount of soy protein in both adolescence and adulthood had the greatest decrease in relative risk compared with women who had a low intake of soy protein at both of these time points (RR=0.41; 95% CI: 0.22, 0.75; RR=0.62; 95% CI: 0.33, 1.19, respectively)
  • No significant association with soy food consumption was found for postmenopausal breast cancer
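
The relative risks above can be read as percentage reductions using the relation reduction = (1 - RR) × 100, and a 95% CI that excludes 1 is conventionally read as statistically significant at the 0.05 level. The helper names below are introduced here for illustration; the numbers are taken from the findings listed above.

```python
def percent_reduction(rr):
    """Percentage risk reduction implied by a relative risk below 1."""
    return round((1.0 - rr) * 100.0)


def rr_from_percent_reduction(pct):
    """Relative risk implied by a percentage risk reduction."""
    return round(1.0 - pct / 100.0, 2)


def ci_excludes_null(lower, upper):
    """True when the confidence interval does not contain RR = 1."""
    return upper < 1.0 or lower > 1.0


# Adult soy protein, premenopausal breast cancer: RR=0.41 (0.25, 0.70)
print(percent_reduction(0.41), ci_excludes_null(0.25, 0.70))
# Adolescent intake: a 43% reduction corresponds to this relative risk
print(rr_from_percent_reduction(43), ci_excludes_null(0.34, 0.97))
# Second combined-intake estimate: RR=0.62 (0.33, 1.19)
print(percent_reduction(0.62), ci_excludes_null(0.33, 1.19))
```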

Other Findings

  • No apparent association was found for soy food intake and menstrual and reproductive characteristics
  • Women with high intake of soy protein had also higher intakes of total energy, fruit, vegetables and meat than those with low soy consumption
  • In postmenopausal women, a slight positive association was observed between breast cancer risk and adolescent soy food intake (RR=1.38; 95% CI: 1.00, 1.91; P=0.038)


Author Conclusion:

This large, population-based, prospective cohort study provides strong evidence of a protective effect of soy food intake against premenopausal breast cancer.

Reviewer Comments:
  • Large population-based cohort study
  • Response rate was around 92%
  • Soy intake was measured twice, including at a follow-up approximately 2 years after baseline
  • Recruitment and follow-up of adolescents were not described, which makes the findings relating soy exposure during adolescence to postmenopausal breast cancer risk unclear
  • Risk of random error in assessing dietary intake by questionnaire
  • Limitations recognized by authors: high soy food intake could be related to certain lifestyles that may be associated with reduced risks of breast cancer; statistical power for the study was low for some subgroup analyses

Research Design and Implementation Criteria Checklist: Primary Research
Relevance Questions
  1. Would implementing the studied intervention or procedure (if found successful) result in improved outcomes for the patients/clients/population group? (Not Applicable for some epidemiological studies)
  2. Did the authors study an outcome (dependent variable) or topic that the patients/clients/population group would care about?
  3. Is the focus of the intervention or procedure (independent variable) or topic of study a common issue of concern to nutrition or dietetics practice?
  4. Is the intervention or procedure feasible? (NA for some epidemiological studies)
Validity Questions
1. Was the research question clearly stated?
  1.1. Was (were) the specific intervention(s) or procedure(s) [independent variable(s)] identified?
  1.2. Was (were) the outcome(s) [dependent variable(s)] clearly indicated?
  1.3. Were the target population and setting specified?
2. Was the selection of study subjects/patients free from bias?
  2.1. Were inclusion/exclusion criteria specified (e.g., risk, point in disease progression, diagnostic or prognosis criteria), and with sufficient detail and without omitting criteria critical to the study?
  2.2. Were criteria applied equally to all study groups?
  2.3. Were health, demographics, and other characteristics of subjects described?
  2.4. Were the subjects/patients a representative sample of the relevant population?
3. Were study groups comparable?
  3.1. Was the method of assigning subjects/patients to groups described and unbiased? (Method of randomization identified if RCT)
  3.2. Were distribution of disease status, prognostic factors, and other factors (e.g., demographics) similar across study groups at baseline?
  3.3. Were concurrent controls used? (Concurrent preferred over historical controls.)
  3.4. If cohort study or cross-sectional study, were groups comparable on important confounding factors and/or were preexisting differences accounted for by using appropriate adjustments in statistical analysis?
  3.5. If case control or cross-sectional study, were potential confounding factors comparable for cases and controls? (If case series or trial with subjects serving as own control, this criterion is not applicable. Criterion may not be applicable in some cross-sectional studies.)
  3.6. If diagnostic test, was there an independent blind comparison with an appropriate reference standard (e.g., "gold standard")?
4. Was method of handling withdrawals described?
  4.1. Were follow-up methods described and the same for all groups?
  4.2. Was the number, characteristics of withdrawals (i.e., dropouts, lost to follow up, attrition rate) and/or response rate (cross-sectional studies) described for each group? (Follow up goal for a strong study is 80%.)
  4.3. Were all enrolled subjects/patients (in the original sample) accounted for?
  4.4. Were reasons for withdrawals similar across groups?
  4.5. If diagnostic test, was decision to perform reference test not dependent on results of test under study?
5. Was blinding used to prevent introduction of bias?
  5.1. In intervention study, were subjects, clinicians/practitioners, and investigators blinded to treatment group, as appropriate?
  5.2. Were data collectors blinded for outcomes assessment? (If outcome is measured using an objective test, such as a lab value, this criterion is assumed to be met.)
  5.3. In cohort study or cross-sectional study, were measurements of outcomes and risk factors blinded?
  5.4. In case control study, was case definition explicit and case ascertainment not influenced by exposure status?
  5.5. In diagnostic study, were test results blinded to patient history and other test results?
6. Were intervention/therapeutic regimens/exposure factor or procedure and any comparison(s) described in detail? Were intervening factors described?
  6.1. In RCT or other intervention trial, were protocols described for all regimens studied?
  6.2. In observational study, were interventions, study settings, and clinicians/provider described?
  6.3. Was the intensity and duration of the intervention or exposure factor sufficient to produce a meaningful effect?
  6.4. Was the amount of exposure and, if relevant, subject/patient compliance measured?
  6.5. Were co-interventions (e.g., ancillary treatments, other therapies) described?
  6.6. Were extra or unplanned treatments described?
  6.7. Was the information for 6.4, 6.5, and 6.6 assessed the same way for all groups?
  6.8. In diagnostic study, were details of test administration and replication sufficient?
7. Were outcomes clearly defined and the measurements valid and reliable?
  7.1. Were primary and secondary endpoints described and relevant to the question?
  7.2. Were nutrition measures appropriate to question and outcomes of concern?
  7.3. Was the period of follow-up long enough for important outcome(s) to occur?
  7.4. Were the observations and measurements based on standard, valid, and reliable data collection instruments/tests/procedures?
  7.5. Was the measurement of effect at an appropriate level of precision?
  7.6. Were other factors accounted for (measured) that could affect outcomes?
  7.7. Were the measurements conducted consistently across groups?
8. Was the statistical analysis appropriate for the study design and type of outcome indicators?
  8.1. Were statistical analyses adequately described and the results reported appropriately?
  8.2. Were correct statistical tests used and assumptions of test not violated?
  8.3. Were statistics reported with levels of significance and/or confidence intervals?
  8.4. Was "intent to treat" analysis of outcomes done (and as appropriate, was there an analysis of outcomes for those maximally exposed or a dose-response analysis)?
  8.5. Were adequate adjustments made for effects of confounding factors that might have affected the outcomes (e.g., multivariate analyses)?
  8.6. Was clinical significance as well as statistical significance reported?
  8.7. If negative findings, was a power calculation reported to address type 2 error?
9. Are conclusions supported by results with biases and limitations taken into consideration?
  9.1. Is there a discussion of findings?
  9.2. Are biases and study limitations identified and discussed?
10. Is bias due to study’s funding or sponsorship unlikely?
  10.1. Were sources of funding and investigators’ affiliations described?
  10.2. Was the study free from apparent conflict of interest?

Copyright American Dietetic Association (ADA).