Frequently Asked Questions

1. What is the Agriculture & Food Systems Institute Crop Composition Database (AFSI CCDB)?

The AFSI CCDB is a project of the Agriculture & Food Systems Institute. In May 2003, the first version of the database was released as a comprehensive, publicly available source of information on the natural variability in crop nutritional composition. The current version of the database (Version 10.0) represents a compilation of the results of component analyses of non-genetically engineered (also referred to as non-genetically modified) apple (Malus domestica), canola (Brassica juncea), canola (Brassica napus), field corn (Zea mays), sweet corn (Zea mays), cassava (Manihot esculenta), cotton (Gossypium hirsutum), eucalyptus, mustard (Brassica juncea), potato (Solanum tuberosum), red pepper (Capsicum annuum), rice (Oryza sativa), sorghum (Sorghum bicolor), soybean (Glycine max), strawberry (Fragaria ananassa), sugar beet (Beta vulgaris), and sugarcane (Saccharum officinarum) submitted by public and private sector organizations engaged in agricultural sciences. Currently, the database contains 1.5 million data points representing 231 compositional components (analytes).

2. What is this database used for?

The database is accessible to scientists from academia, government agencies, and industry, as well as to the general public. According to Google Analytics, in 2024 the AFSI CCDB logged over 48,190 views by 6,900 users from 155 countries. A search of the data can yield reports of average nutritional component levels and ranges of those components in seed, forage, and other plant matrices. Filters can be applied to retrieve selected subsets of data, such as those obtained from samples collected in a specific year or at a specific location (primary search criteria), or those generated using a specific analytical method (analyte filters).

3. How is the data included in this database generated?

Data are derived from samples of numerous crop varieties from multiple seasons and are produced in controlled field trials using standard commercial practices at various locations throughout the world. Representative plant samples are obtained from field-grown crops with known production locations and dates. Analytical methods used in the generation of the data are validated. You can view additional information regarding data acceptance criteria in the About CCDB section.

4. When will new data and/or new crops be added to the AFSI CCDB in the future?

Additional data for existing crops will be uploaded to the database by the data providers as it becomes available, and a new version of the publicly accessible database will be published once a sufficient amount of new data has been uploaded and validated. New crops will also be added to the database once enough data becomes available. For information about providing data for inclusion in future version releases of the AFSI CCDB, please contact ccdb@foodsystems.org.

5. How can I get more information?

The following publications are a good source of information about the database:

Ridley, W.P., Shillito, R.D., Coats, I., Steiner, H-Y, Shawgo, M., Phillips, A., Dussold, P., Kurtyka, L., 2004. Development of the International Life Sciences Institute Crop Composition Database. Journal of Food Composition and Analysis 17: 423-438.

Alba, R., Phillips, A., Mackie, S., Gillikin, N., Maxwell, C., Brune, P., Ridley, W., Fitzpatrick, J., Levine, M., Harris, S., 2010. Improvements to the International Life Sciences Institute Crop Composition Database. Journal of Food Composition and Analysis 23:741-748.

Sult T, Barthet V, Bennett L, Edwards A, Fast B, Gillikin N, Launis K, New S, Rogers-Szuma K, Sabbatini J, Srinivasan J, Tilton G, Venkatesh TV 2016. Release of the International Life Sciences Institute Crop Composition Database Version 5. Journal of Food Composition and Analysis 51:106-111.

6. How do I get updates about the database?

To receive periodic updates about AFSI and the AFSI CCDB, sign up using the Register for Updates link.

7. How do I print my data output reports?

There are three Output Formats: HTML, Adobe Acrobat (PDF) and Comma Delimited File (CSV).

To print the HTML format, select Output Format: HTML, click Run Report, click on the report and press [Ctrl] [A], then right click and select Print.
To print the PDF format, select Output Format: Adobe Acrobat (PDF), click Run Report. The report will download to your computer. Open the file and print the report.
To print the CSV format, select Output Format: Comma Delimited File (CSV), click Run Report. The report will download to your computer. Open the file and print the report.

8. How can I get more help running a search and generating a report?

Please view the videos found on the Help page for navigating the search function and generating reports. The About CCDB page and Frequently Asked Questions will provide more information on the database. If you have additional concerns, contact ccdb@foodsystems.org.

9. Which analytes are included in the database and are all analytes important for the assessment of a crop?

The Analyte Types with Associated Analytes list on the About CCDB page includes all the analytes in the AFSI CCDB. This does not in any way imply that all reported analytes are important in the assessment of a particular crop. The relevant Organisation for Economic Co-operation and Development (OECD) Consensus Document for the composition analysis of a specific crop, from the OECD work on the Safety of Novel Foods and Feeds: Plants, is typically referenced when determining which analytes are of biological relevance. There are some instances where data has been provided for analytes not included in the relevant OECD guidance documents.

10. How are minimum, maximum, and mean values calculated for an analyte when a subset of the values for said analyte are below the limit of quantitation (LOQ)?

Minimum, maximum, and mean values are derived only from data that is above the LOQ for the analytical method used. For a more comprehensive understanding of the distribution within a particular dataset, the option to report the number of samples below the LOQ can be selected in the report options.
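
As an illustration only, the calculation described above can be sketched in Python. The function name, the sample values, and the LOQ of 1.0 are hypothetical, and treating a value equal to the LOQ as not quantified is an assumption based on the wording "above the LOQ"; this is not the database's actual implementation.

```python
from statistics import mean


def summarize_above_loq(values, loq):
    """Summarize only the values above the LOQ and count the rest.

    Assumption: a value must be strictly greater than the LOQ to count
    as quantified (the FAQ says "above the LOQ").
    """
    quantified = [v for v in values if v > loq]
    n_below = len(values) - len(quantified)
    if not quantified:
        # Nothing was quantified, so no statistics can be reported.
        return {"n": 0, "n_below_loq": n_below,
                "min": None, "max": None, "mean": None}
    return {
        "n": len(quantified),
        "n_below_loq": n_below,
        "min": min(quantified),
        "max": max(quantified),
        "mean": mean(quantified),
    }


# Hypothetical results for one analyte; two of the five are below the LOQ.
sample_values = [0.5, 2.0, 4.0, 0.25, 6.0]
summary = summarize_above_loq(sample_values, loq=1.0)
print(summary)
```

In this sketch the two values below the LOQ do not lower the mean; they appear only in the separate count, which is why that count helps when interpreting the distribution.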

11. Is there a relationship between the number of decimal places and the precision for an analytical method?

The number of decimal places displayed in an output report reflects the number of decimal places defined for that unit of measure and does not imply a specific precision for an analytical result. Trailing zeros (to the right of a decimal) in the database outputs also reflect the number of decimal places in the submitted raw data and do not imply a specific precision for an analytical result. Exact numbers (e.g., conversion factors, such as 1000 mg = 1 g or 1 g = 1,000,000 µg) do not affect the certainty of a calculation and are ignored when determining the number of decimal places in the output of database calculations. Rounding of numbers is used only for reporting/output purposes and is deferred until all calculations have been made. Users should also note that spreadsheet programs typically remove trailing zeros from data values.
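
The following Python sketch illustrates two of the points above: rounding only once, after the calculation is complete, and the loss of trailing zeros when a value is read as a plain number (as spreadsheet programs typically do). The function name, the sample values, and the half-up rounding mode are assumptions made for illustration; the rounding rule actually used by the database is not stated here.

```python
from decimal import Decimal, ROUND_HALF_UP


def mean_for_report(raw_values, places):
    """Average values given as text, rounding only at the very end.

    Assumption: half-up rounding; the database's rule may differ.
    """
    values = [Decimal(v) for v in raw_values]
    full_precision_mean = sum(values) / len(values)
    quantum = Decimal(1).scaleb(-places)  # e.g. places=3 gives 0.001
    return full_precision_mean.quantize(quantum, rounding=ROUND_HALF_UP)


# Hypothetical submitted values, kept as text so trailing zeros survive.
report_a = mean_for_report(["1.250", "1.300", "1.145"], places=3)
report_b = mean_for_report(["2.10", "2.30"], places=2)

print(report_a)         # rounded once, after averaging
print(report_b)         # the trailing zero is kept in the Decimal output
print(float(report_b))  # converting to a float drops the trailing zero
```

Reading a CSV report into a tool that treats every cell as text (or as a decimal type, as above) is one way to keep the displayed decimal places intact.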

12. Why are the sample numbers (N) different for the same analyte, in the same crop and tissue, when I compare data for fresh weight (% FW) and dry weight (% DW)?

In order to convert values from % FW to % DW or vice versa, submitted data must include a corresponding moisture value for each sample. There are some instances in which a value for an analyte was submitted as % FW or % DW, but moisture data for that sample was not available. In these cases, conversion to some units of measure is not possible.
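
A minimal Python sketch of why N can differ is shown below. It assumes the standard moisture-based conversion, % DW = % FW × 100 / (100 − % moisture); the function name, field names, and sample values are hypothetical, and the database's own conversion code is not reproduced here.

```python
def fw_to_dw(percent_fw, moisture_percent):
    """Convert a % fresh weight value to % dry weight.

    Returns None when no usable moisture value exists for the sample,
    because the conversion cannot be made without it.
    """
    if moisture_percent is None or moisture_percent >= 100.0:
        return None
    return percent_fw * 100.0 / (100.0 - moisture_percent)


# Hypothetical samples: the third one was submitted without moisture data.
samples = [
    {"id": "S1", "protein_fw": 9.0, "moisture": 10.0},
    {"id": "S2", "protein_fw": 8.0, "moisture": 20.0},
    {"id": "S3", "protein_fw": 8.5, "moisture": None},
]

dw_values = [fw_to_dw(s["protein_fw"], s["moisture"]) for s in samples]
converted = [v for v in dw_values if v is not None]

n_fw = len(samples)    # every sample has a % FW value
n_dw = len(converted)  # only samples with moisture data convert to % DW
print(n_fw, n_dw, converted)
```

Here all three samples contribute to N for % FW, but only two contribute to N for % DW.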

13. Why are the sample numbers (N) different for the same fatty acid (FA), in the same crop and tissue, when I compare data for % Total FA and % FW?

Some of the data for fatty acids were submitted only with the units of % Total FA. Because these data have no reference to sample weight, % Total FA values cannot be converted to other weight-based units (e.g. % FW or mg/g). Therefore, these particular samples do not contribute to the sample number (N) that is observed when a user requests fatty acid data in the units of % FW.

14. Why was the analyte 'Total Fat' included in Version 4.2 but not in Version 5.0?

In the AFSI CCDB Version 5.0, the analyte Total Fat was renamed to Crude Fat, which more accurately describes the analytical methods used for extractable fat with gravimetric determination. Total Fat has various meanings globally, is usually associated with nutritional labeling terminology, and is typically calculated based on fatty acid content. Values for Total Fat in Version 4.2 are included in Version 5.0 under Crude Fat.

15. What is the difference between the soybean seed analytes Lectins and Soybean Lectin?

Values for the analyte Lectins were determined using a hemagglutination assay described by Liener et al. (Archives of Biochemistry and Biophysics, 54:223-231 (1955)). This assay takes advantage of the agglutination properties of lectins as a class of proteins. The hemagglutination technique requires the use of rabbit red blood cells; the results are defined as the level of standard red blood cells causing 50% suspension to sediment in 2.5 hours and are presented in hemagglutinating units (H.U.). This technique is labor-intensive, and variation between the sources, production, and preparation of red blood cells, and between the laboratories conducting the analyses, can contribute to variability in the values generated. The hemagglutination assay is not specific to soybean lectin.
Soybean Lectin, also known as soybean agglutinin (SBA), is a tetrameric protein constituted by 30 kDa subunits with binding specificity towards galactose (Gal) and N-acetylgalactosamine (Gal-NAc), as described by Lotan et al. (The Journal of Biological Chemistry, 249:1219-1224 (1974)). Soybean Lectin values added to the Crop Composition Database Version 5.0 were determined using a newly developed and validated Enzyme-Linked-Lectin-Assay (ELLA), which demonstrates acceptable sensitivity, precision, accuracy, and specificity, and is more efficient than the hemagglutination assay, as described by Wang et al. (Food Chemistry, 113:1218-1225 (2009)) and Breeze et al. (Communicated to Journal of American Oil Chemists Society (2014)). Using the ELLA method, Soybean Lectin values are reported on a weight basis, similar to other analytes reported in the database.

16. Why are forage data displayed only in dry weight?

Data are from forage samples that were either frozen immediately after collection or oven dried prior to freezing. The moisture content of the samples frozen immediately is higher than the moisture content of the oven dried samples; therefore, fresh weight values are not comparable between the two sample types. To compensate for differences in moisture content and provide comparable values, forage data are displayed only on a dry weight basis.

17. Why are field corn and sweet corn classified as separate crops in the database when both are Zea mays?

The principal use of corn (Zea mays) is grain for animal feed (field corn); however, any type of corn may be used as vegetable corn by eating the kernels when they are immature. Modern sweet corn cultivars are developed specifically for use as a vegetable and are based on at least eight endogenous genes that, used singly or in combination, result in increased sugar content and decreased starch content in the endosperm, and also impact the eating quality (flavor, aroma, texture, and tenderness) and visual appearance (kernel color and shape, and ear shape) (Marshall SW and Tracy WF, 2007). Therefore, because there are distinct differences in composition between field corn (Zea mays) and sweet corn (Zea mays), and because there is a distinct difference in the growth stage of the consumed product (mature vs. immature kernels), the two crops are classified separately in the database.

Reference: Marshall SW and Tracy WF. 2007. Sweet Corn. In Corn Chemistry and Technology. White PJ and Johnson LA, eds. St. Paul, Minnesota, USA: American Association of Cereal Chemists, Inc. pp. 537-569.

18. Why is the Analyte ‘Carbohydrate by Calculation’ listed under Analyte Type ‘Proximates’ and not under ‘Carbohydrates’?

The major constituents in food include protein, fat, moisture, ash, and carbohydrates, which are collectively termed ‘Proximates’. ‘Carbohydrate by Calculation’ is a calculated parameter, or derived value, that is not obtained by analysis but is defined as the difference between 100 and the sum of the crude protein, fat, moisture, and ash [100 - (crude protein + fat + moisture + ash)], according to the USDA Agriculture Handbook No. 74 (1973). In the AFSI CCDB, ‘Carbohydrate by Calculation’ is therefore listed under the Analyte Type ‘Proximates’. This value represents the estimated ‘Total Carbohydrate’, which includes all carbohydrates, fiber, and organic acids.
Another term, ‘Available Carbohydrate’, which is not used in the CCDB, represents only a fraction of the ‘Total Carbohydrate’ and is defined as the sum of free sugars (glucose, fructose, sucrose, lactose, maltose), starch, dextrins, and glycogen. ‘Available Carbohydrate’ does not include dietary fiber (soluble or insoluble), which is obtained by direct analysis. Dietary Fiber is listed under Analyte Type ‘Fiber’ in the AFSI CCDB.
In the AFSI CCDB, under Analyte Type ‘Carbohydrates’, sugar and starch analytes are listed for different crop types.
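
The ‘Carbohydrate by Calculation’ formula given above, 100 - (crude protein + fat + moisture + ash), can be written as a short Python sketch. The function name and the sample proximate values are hypothetical, and the sketch assumes all four inputs are percentages expressed on the same basis.

```python
def carbohydrate_by_calculation(crude_protein, fat, moisture, ash):
    """Return carbohydrate by difference, in percent.

    All inputs are percentages on the same basis (an assumption of
    this sketch); the result is derived, not measured.
    """
    return 100.0 - (crude_protein + fat + moisture + ash)


# Hypothetical proximate values for one grain sample, in percent.
carbohydrate = carbohydrate_by_calculation(
    crude_protein=9.5, fat=4.0, moisture=11.0, ash=1.5
)
print(carbohydrate)  # 74.0
```

Because the value is obtained by difference, it carries everything not accounted for by the four measured proximates, which is why it includes fiber and organic acids as well as sugars and starch.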

19. Under “Crop Type”, why is there an option to select a single species or variety or a combination of multiple species or varieties for canola, mustard, and apple?

For canola and mustard there is more than one species in the database and for apple there is more than one variety in the database. The user may want to output data for all species or varieties combined or to output data for only one individual species or variety. For example, the user can select “Canola - Brassica napus” or “Canola - Brassica juncea” and output the data separately. However, if the user is interested in outputting the data for both species of canola, the combination “Canola - Brassica juncea + Canola - Brassica napus” may be used. Without this feature, the user would need to output the data sets for each individual species and merge them.

20. Are results of mono-unsaturated fatty acid analyses isomer specific?

The names associated with mono-unsaturated fatty acids (MUFA) in the AFSI CCDB generally reflect the presence of a single isomer (n-9). For most crops in the database, the n-9 isomer of 18:1 through 24:1 MUFAs is the only isomer that is present in the crop tissue and captured in the database. However, for Brassica species (e.g., Brassica napus, Brassica juncea), both the predominant n-9 isomer and the n-7 isomer are present and together comprise the single value reported for 18:1, 20:1, and 22:1 MUFAs. In practice, this means that, for canola (B. napus or B. juncea) or mustard (B. juncea) seeds in the AFSI CCDB, the value for 18:1 oleic represents the sum of both the n-7 (cis-vaccenic) and n-9 (oleic) isomers. Similarly, 20:1 eicosenoic represents the sum of 20:1 n-7 and 20:1 n-9 fatty acids, and 22:1 erucic represents the sum of 22:1 n-7 and 22:1 n-9 (erucic). Additional information on the presence of n-7 and n-9 MUFAs in Brassica species may be found in the published literature (Barthet V, 2008).

Reference: Barthet V. 2008. (n-7) and (n-9) cis-monounsaturated fatty acid contents of 12 Brassica species. Phytochemistry. 69: 411-7.