
Data Analysis in Research: Types & Methods


Content Index

  • What is data analysis in research?
  • Why analyze data in research?
  • Types of data in research
  • Finding patterns in the qualitative data
  • Methods used for data analysis in qualitative research
  • Preparing data for analysis
  • Methods used for data analysis in quantitative research
  • Considerations in research data analysis

What is data analysis in research?

Definition of research in data analysis: According to LeCompte and Schensul, research data analysis is a process used by researchers to reduce data to a story and interpret it to derive insights. The data analysis process helps reduce a large chunk of data into smaller fragments that make sense.

Three essential things occur during the data analysis process. The first is data organization. The second is data reduction through summarization and categorization, which helps find patterns and themes in the data for easy identification and linking. The third is the data analysis itself, which researchers carry out in both a top-down and a bottom-up fashion.


On the other hand, Marshall and Rossman describe data analysis as a messy, ambiguous, and time-consuming but creative and fascinating process through which a mass of collected data is brought to order, structure and meaning.

We can say that “the data analysis and data interpretation is a process representing the application of deductive and inductive logic to the research and data analysis.”

Why analyze data in research?

Researchers rely heavily on data because they have a story to tell or research problems to solve. It starts with a question, and data is nothing but an answer to that question. But what if there is no question to ask? It is still possible to explore data without a problem. This is called ‘data mining’, and it often reveals interesting patterns within the data that are worth exploring.

Irrespective of the type of data researchers explore, their mission and their audience’s vision guide them to find the patterns that shape the story they want to tell. One of the essential things expected from researchers while analyzing data is to stay open and remain unbiased toward unexpected patterns, expressions, and results. Remember, data analysis sometimes tells the most unforeseen yet exciting stories, ones that were not expected when the analysis began. Therefore, rely on the data you have at hand and enjoy the journey of exploratory research.


Types of data in research

Every kind of data describes things by assigning a specific value to them. For analysis, you need to organize these values and process and present them in a given context to make them useful. Data can take different forms; here are the primary data types.

  • Qualitative data: When the data presented has words and descriptions, we call it qualitative data. Although you can observe this data, it is subjective and harder to analyze, especially for comparison. Example: anything describing taste, experience, texture, or an opinion is considered qualitative data. This type of data is usually collected through focus groups, personal qualitative interviews, qualitative observation, or open-ended questions in surveys.
  • Quantitative data: Any data expressed in numbers or numerical figures is called quantitative data. This type of data can be distinguished into categories, grouped, measured, calculated, or ranked. Example: answers to questions about age, rank, cost, length, weight, scores, and so on all come under this type of data. You can present such data in graphs or charts, or apply statistical analysis methods to it. Outcomes Measurement Systems (OMS) questionnaires in surveys are a significant source of numeric data.
  • Categorical data: It is data presented in groups. However, an item included in the categorical data cannot belong to more than one group. Example: a person responding to a survey by reporting their lifestyle, marital status, smoking habit, or drinking habit provides categorical data. A chi-square test is a standard method used to analyze this data.
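
The chi-square test mentioned for categorical data can be sketched in a few lines. The example below is a minimal illustration, not a full statistical workflow: the 2x2 table of marital status by smoking habit uses made-up counts, the function name `chi_square_statistic` is hypothetical, and looking up the p-value is left to a statistics package.

```python
def chi_square_statistic(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = married / single, columns = smoker / non-smoker
observed = [[20, 30], [30, 20]]
stat, dof = chi_square_statistic(observed)
print(f"chi-square = {stat:.2f}, degrees of freedom = {dof}")
```

A larger statistic relative to the degrees of freedom indicates a bigger gap between the observed counts and the counts expected under independence.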


Data analysis in qualitative research

Data analysis in qualitative research works a little differently from the analysis of numerical data, because qualitative data is made up of words, descriptions, images, objects, and sometimes symbols. Getting insight from such complicated information is a complicated process. Hence it is typically used for exploratory research and data analysis.

Finding patterns in the qualitative data

Although there are several ways to find patterns in textual information, a word-based method is the most relied-upon and widely used technique for research and data analysis. Notably, the data analysis process in qualitative research is manual. Here the researchers usually read the available data and find repetitive or commonly used words.

For example, while studying data collected from African countries to understand the most pressing issues people face, researchers might find that “food” and “hunger” are the most commonly used words and highlight them for further analysis.
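
Although the process is described as manual, the word-counting step can be illustrated with a short script. This is a minimal sketch under stated assumptions: the responses are invented, the stop-word list is a small hypothetical one, and `top_words` is an illustrative helper rather than an established tool.

```python
import re
from collections import Counter

# Small, hypothetical stop-word list; a real study would use a fuller one
STOP_WORDS = {"the", "is", "a", "and", "of", "to", "we", "in", "our",
              "for", "not", "have", "do", "up"}


def top_words(responses, n=3):
    """Return the n most frequent non-stop-words across all responses."""
    words = []
    for text in responses:
        tokens = re.findall(r"[a-z']+", text.lower())
        words.extend(t for t in tokens if t not in STOP_WORDS)
    return Counter(words).most_common(n)


# Invented open-ended survey responses
responses = [
    "Food is scarce and hunger is rising in our village.",
    "We do not have enough food for the children.",
    "Hunger affects the school attendance of children.",
    "Food prices keep going up.",
]

top = top_words(responses, n=3)
for word, count in top:
    print(word, count)
```

The most frequent words are only candidates for further analysis; the researcher still has to read them in context.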


Keyword in context is another widely used word-based technique. In this method, the researcher tries to understand a concept by analyzing the context in which the participants use a particular keyword.

For example, researchers studying the concept of ‘diabetes’ amongst respondents might analyze when and how each respondent has used or referred to the word ‘diabetes.’
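
A keyword-in-context listing can be sketched as follows. The transcript is invented, and the function name `keyword_in_context` and its `window` parameter are assumptions made for illustration.

```python
import re


def keyword_in_context(text, keyword, window=4):
    """Return (left context, keyword, right context) for each occurrence,
    with up to `window` words on either side."""
    tokens = re.findall(r"[\w']+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


# Invented interview excerpt
transcript = (
    "My mother was diagnosed with diabetes last year. "
    "Since then I worry that diabetes runs in the family."
)

contexts = keyword_in_context(transcript, "diabetes", window=3)
for left, word, right in contexts:
    print(f"... {left} [{word}] {right} ...")
```

Note that this simple version ignores sentence boundaries, so the context window may span two sentences.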

The scrutiny-based technique is also one of the highly recommended text analysis methods used to identify patterns in qualitative data. Compare and contrast is a widely used method under this technique for identifying how pieces of text are similar to or different from each other.

For example, to find out the “importance of a resident doctor in a company,” the collected data is divided into people who think it is necessary to hire a resident doctor and those who think it is unnecessary. Compare and contrast works well for analyzing polls with single-answer question types.

Metaphors can be used to reduce the data pile and find patterns in it so that it becomes easier to connect data with theory.

Variable Partitioning is another technique used to split variables so that researchers can find more coherent descriptions and explanations from the enormous data.


Methods used for data analysis in qualitative research

There are several techniques to analyze the data in qualitative research, but here are some commonly used methods:

  • Content Analysis: It is widely accepted and the most frequently employed technique for data analysis in research methodology. It can be used to analyze documented information from text, images, and sometimes physical items. When and where to use this method depends on the research questions.
  • Narrative Analysis: This method is used to analyze content gathered from various sources such as personal interviews, field observation, and surveys. Most of the time, the stories or opinions shared by people are examined to find answers to the research questions.
  • Discourse Analysis: Similar to narrative analysis, discourse analysis is used to analyze interactions with people. However, this method considers the social context in which the communication between the researcher and respondent takes place. In addition, discourse analysis takes the respondent’s lifestyle and day-to-day environment into account when deriving any conclusion.
  • Grounded Theory: When you want to explain why a particular phenomenon happened, grounded theory is a strong choice for analyzing qualitative data. Grounded theory is applied to study data about a host of similar cases occurring in different settings. When researchers use this method, they might alter explanations or produce new ones until they arrive at a conclusion.


Data analysis in quantitative research

Preparing data for analysis

The first stage in research and data analysis is to prepare the data for analysis so that the nominal data can be converted into something meaningful. Data preparation consists of the phases below.

Phase I: Data Validation

Data validation is done to understand whether the collected data sample meets the pre-set standards or is a biased sample. It is divided into four stages:

  • Fraud: To ensure an actual human being records each response to the survey or the questionnaire
  • Screening: To make sure each participant or respondent is selected or chosen in compliance with the research criteria
  • Procedure: To ensure ethical standards were maintained while collecting the data sample
  • Completeness: To ensure that the respondent has answered all the questions in an online survey or, in an interview, that the interviewer has asked all the questions devised in the questionnaire.
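
Two of the four stages, screening and completeness, lend themselves to automated checks. The sketch below is illustrative only: the question ids, the minimum-age criterion, and the `validate` function are all hypothetical. Fraud and procedure checks depend on information outside a single response record, so they are not covered here.

```python
REQUIRED_QUESTIONS = ("q1", "q2", "q3")  # hypothetical question ids
MIN_AGE = 18                             # hypothetical screening criterion


def validate(response):
    """Return a list of validation problems for one survey response (a dict)."""
    problems = []

    # Screening: the respondent must meet the research criteria
    age = response.get("age")
    if age is None or age < MIN_AGE:
        problems.append("screening: respondent does not meet the age criterion")

    # Completeness: every required question must have a non-empty answer
    missing = [q for q in REQUIRED_QUESTIONS if response.get(q) in (None, "")]
    if missing:
        problems.append("completeness: missing " + ", ".join(missing))

    return problems


print(validate({"age": 30, "q1": "yes", "q2": "no", "q3": "maybe"}))
print(validate({"age": 16, "q1": "yes", "q2": "", "q3": None}))
```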

Phase II: Data Editing

More often than not, an extensive research data sample comes loaded with errors. Respondents sometimes fill in some fields incorrectly or skip them accidentally. Data editing is a process wherein the researchers confirm that the provided data is free of such errors. They need to conduct the necessary checks, including outlier checks, to edit the raw data and make it ready for analysis.
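
The text does not say how outlier checks are carried out. One common convention, used here purely as an assumption, flags values more than 1.5 times the interquartile range beyond the first or third quartile. The ages and the function name `iqr_outliers` are illustrative.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (an assumed rule of thumb)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Invented ages; 240 is a plausible typing error for 24
ages = [22, 25, 27, 29, 31, 33, 35, 240]
outliers = iqr_outliers(ages)
print(outliers)
```

A flagged value is a prompt for review, not proof of an error; the researcher decides whether to correct, keep, or drop it.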

Phase III: Data Coding

Out of all three, this is the most critical phase of data preparation; it involves grouping and assigning values to the survey responses. For example, if a survey is completed with a sample size of 1,000, the researcher can create age brackets to distinguish the respondents based on their age. It then becomes easier to analyze small data buckets than to deal with one massive data pile.
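
The age-bracket example can be sketched as a simple coding function. The bracket boundaries, labels, and ages below are hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical age brackets (inclusive bounds)
AGE_BRACKETS = [(18, 24), (25, 34), (35, 44), (45, 54), (55, 64)]


def code_age(age):
    """Map a raw age to its bracket label."""
    for low, high in AGE_BRACKETS:
        if low <= age <= high:
            return f"{low}-{high}"
    return "65+" if age >= 65 else "under 18"


# Invented respondent ages
ages = [19, 23, 31, 38, 42, 47, 52, 58, 66, 71]
bucket_counts = Counter(code_age(a) for a in ages)

for label, count in sorted(bucket_counts.items()):
    print(label, count)
```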


Methods used for data analysis in quantitative research

After the data is prepared for analysis, researchers can use different research and data analysis methods to derive meaningful insights. Statistical analysis is the most favored approach for analyzing numerical data. In statistical analysis, distinguishing between categorical data and numerical data is essential, as categorical data involves distinct categories or labels, while numerical data consists of measurable quantities. Statistical methods are classified into two groups: descriptive statistics, used to describe data, and inferential statistics, which help in comparing the data.

Descriptive statistics

This method is used to describe the basic features of various types of data in research. It presents the data in a meaningful way so that patterns in the data start making sense. Nevertheless, descriptive analysis does not support conclusions beyond the data at hand; any conclusions still rest on the hypotheses researchers have formulated so far. Here are a few major types of descriptive analysis methods.

Measures of Frequency

  • Count, Percent, Frequency
  • It is used to denote how often a particular event occurs.
  • Researchers use it when they want to showcase how often a response is given.
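
As a minimal sketch of count and percent, the snippet below tallies invented answers to a single survey question.

```python
from collections import Counter

# Invented answers to one survey question
answers = ["yes", "no", "yes", "yes", "undecided", "no", "yes", "yes"]

counts = Counter(answers)
total = len(answers)
percentages = {answer: 100 * n / total for answer, n in counts.items()}

for answer, n in counts.most_common():
    print(f"{answer}: count={n}, percent={percentages[answer]:.1f}%")
```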

Measures of Central Tendency

  • Mean, Median, Mode
  • The method is widely used to summarize a distribution with a single representative point.
  • Researchers use this method when they want to showcase the most common or average response.
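
The three measures can be computed with Python's standard `statistics` module. The test scores below are invented for illustration.

```python
import statistics

# Invented test scores
scores = [70, 85, 85, 90, 60, 75, 85, 95, 80, 75]

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value of the sorted scores
mode_score = statistics.mode(scores)      # most frequent value

print(mean_score, median_score, mode_score)
```

When the mean and median differ noticeably, as they do slightly here, the distribution is likely skewed and the median may be the more representative summary.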

Measures of Dispersion or Variation

  • Range, Variance, Standard deviation
  • Range = the difference between the highest and lowest points.
  • Variance and standard deviation = measures of the difference between the observed scores and the mean.
  • It is used to identify the spread of scores by stating intervals.
  • Researchers use this method to show how spread out the data is. It helps them identify how widely the data is dispersed and how far that spread affects the mean.
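
A short sketch of the three dispersion measures follows. The values are invented, and the population formulas (`pvariance`, `pstdev`) are used on the assumption that the data is the whole group of interest; for a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` would be the usual choice.

```python
import statistics

# Invented scores, treated here as a complete population
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)    # highest minus lowest point
variance = statistics.pvariance(values)    # mean squared deviation from the mean
std_dev = statistics.pstdev(values)        # square root of the variance

print(value_range, variance, std_dev)
```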

Measures of Position

  • Percentile ranks, Quartile ranks
  • It relies on standardized scores helping researchers to identify the relationship between different scores.
  • It is often used when researchers want to compare scores with the average count.
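
Percentile ranks and quartiles can be sketched as below. Definitions vary between textbooks and software; this example assumes the "percentage of scores at or below" definition for the percentile rank and relies on the default method of `statistics.quantiles` for quartiles. The scores are invented.

```python
import statistics


def percentile_rank(scores, value):
    """Percentage of scores at or below `value` (one common definition)."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)


# Invented scores
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

rank_of_80 = percentile_rank(scores, 80)
quartiles = statistics.quantiles(scores, n=4)  # [Q1, Q2 (median), Q3]

print(rank_of_80, quartiles)
```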

In quantitative research, descriptive analysis often gives absolute numbers, but those numbers alone are never sufficient to demonstrate the rationale behind them. It is therefore necessary to think about which method of research and data analysis best suits your survey questionnaire and the story researchers want to tell. For example, the mean is the best way to demonstrate students’ average scores in schools. It is better to rely on descriptive statistics when the researchers intend to keep the research or outcome limited to the provided sample without generalizing it. For example, when you want to compare the average voting turnout in two different cities, descriptive statistics are enough.

Descriptive analysis is also called a ‘univariate analysis’ since it is commonly used to analyze a single variable.

Inferential statistics

Inferential statistics are used to make predictions about a larger population after research and data analysis of a sample collected from that population. For example, you can ask about 100 audience members at a movie theater if they like the movie they are watching. Researchers then use inferential statistics on the collected sample to reason that about 80-90% of people like the movie.

Here are two significant areas of inferential statistics.

  • Estimating parameters: It takes statistics from the sample research data and demonstrates something about the population parameter.
  • Hypothesis test: It’s about sampling research data to answer the survey research questions. For example, researchers might be interested to understand if the new shade of lipstick recently launched is good or not, or if the multivitamin capsules help children to perform better at games.
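
Parameter estimation can be illustrated with the movie theater example. The sketch below assumes, hypothetically, that 85 of 100 people surveyed liked the movie, and uses the normal-approximation (Wald) interval for a proportion, which is one of several possible interval methods and is not named in the text.

```python
import math


def proportion_confidence_interval(successes, n, z=1.96):
    """Approximate 95% confidence interval for a proportion (Wald interval)."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Hypothetical sample: 85 of 100 audience members liked the movie
low, high = proportion_confidence_interval(85, 100)
print(f"Estimated share who like the movie: {low:.0%} to {high:.0%}")
```

The width of the interval shrinks as the sample grows, which is why a larger sample supports a more precise statement about the population.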

These are sophisticated analysis methods used to showcase the relationship between different variables instead of describing a single variable. They are often used when researchers want something beyond absolute numbers to understand the relationship between variables.

Here are some of the commonly used methods for data analysis in research.

  • Correlation: When researchers are not conducting experimental or quasi-experimental research but want to understand the relationship between two or more variables, they opt for correlational research methods.
  • Cross-tabulation: Also called contingency tables, cross-tabulation is used to analyze the relationship between multiple variables. Suppose the provided data has age and gender categories presented in rows and columns. A two-dimensional cross-tabulation makes analysis easier by showing the number of males and females in each age category.
  • Regression analysis: To understand the strength of the relationship between variables, researchers commonly turn to regression analysis, which is also a type of predictive analysis. In this method, you have an essential factor called the dependent variable and one or more independent variables. You then work out the impact of the independent variables on the dependent variable. The values of both independent and dependent variables are assumed to be ascertained in an error-free, random manner.
  • Frequency tables: A frequency table lists each value or category of a variable alongside the number of times it occurs in the data.
  • Analysis of variance: This statistical procedure tests the degree to which two or more groups vary or differ in an experiment. A considerable degree of variation suggests that the research findings are significant. In many contexts, the terms ANOVA testing and variance analysis are used interchangeably.
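
Two of the methods above, cross-tabulation and correlation, can be sketched with the standard library alone. The respondent records, study hours, and test scores are invented, and the helper names `cross_tabulate` and `pearson_r` are illustrative.

```python
import math
from collections import Counter


def cross_tabulate(records, row_key, col_key):
    """Count records for every (row category, column category) pair."""
    return Counter((r[row_key], r[col_key]) for r in records)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented respondent records for the cross-tabulation
respondents = [
    {"age_group": "18-34", "gender": "female"},
    {"age_group": "18-34", "gender": "male"},
    {"age_group": "18-34", "gender": "female"},
    {"age_group": "35-54", "gender": "male"},
    {"age_group": "35-54", "gender": "male"},
]
table = cross_tabulate(respondents, "age_group", "gender")
for (age_group, gender), count in sorted(table.items()):
    print(age_group, gender, count)

# Invented paired measurements for the correlation
study_hours = [1, 2, 3, 4, 5]
test_scores = [52, 58, 61, 70, 74]
r = pearson_r(study_hours, test_scores)
print(f"r = {r:.2f}")
```

A coefficient close to 1 or -1 indicates a strong linear relationship, but it says nothing about which variable, if either, causes the other.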

Considerations in research data analysis

  • Researchers must have the necessary research skills to analyze and manipulate the data, and must be trained to demonstrate a high standard of research practice. Ideally, researchers should possess more than a basic understanding of the rationale for selecting one statistical method over another to obtain better data insights.
  • Research and data analytics projects usually differ by scientific discipline; therefore, getting statistical advice at the beginning of analysis helps in designing a survey questionnaire, selecting data collection methods, and choosing samples.


  • The primary aim of data research and analysis is to derive ultimate insights that are unbiased. Any mistake or bias in collecting data, selecting an analysis method, or choosing an audience sample is likely to lead to a biased inference.
  • No amount of sophistication in research data and analysis can rectify poorly defined objective outcome measurements. Whether the design is at fault or the intentions are unclear, a lack of clarity might mislead readers, so avoid the practice.
  • The motive behind data analysis in research is to present accurate and reliable data. As far as possible, avoid statistical errors, and find a way to deal with everyday challenges like outliers, missing data, data altering, data mining, or developing graphical representations.

The sheer amount of data generated daily is frightening, especially since data analysis took center stage in 2018. In the last year, the total data supply amounted to 2.8 trillion gigabytes. Hence, it is clear that enterprises willing to survive in the hypercompetitive world must possess an excellent capability to analyze complex research data, derive actionable insights, and adapt to new market needs.


QuestionPro is an online survey platform that empowers organizations in data analysis and research and provides them a medium to collect data by creating appealing surveys.


  • Open access
  • Published: 07 September 2020

A tutorial on methodological studies: the what, when, how and why

  • Lawrence Mbuagbaw   ORCID: orcid.org/0000-0001-5855-5461 1 , 2 , 3 ,
  • Daeria O. Lawson 1 ,
  • Livia Puljak 4 ,
  • David B. Allison 5 &
  • Lehana Thabane 1 , 2 , 6 , 7 , 8  

BMC Medical Research Methodology, volume 20, Article number: 226 (2020)


Methodological studies – studies that evaluate the design, analysis or reporting of other research-related reports – play an important role in health research. They help to highlight issues in the conduct of research with the aim of improving health research methodology, and ultimately reducing research waste.

We provide an overview of some of the key aspects of methodological studies such as what they are, and when, how and why they are done. We adopt a “frequently asked questions” format to facilitate reading this paper and provide multiple examples to help guide researchers interested in conducting methodological studies. Some of the topics addressed include: is it necessary to publish a study protocol? How to select relevant research reports and databases for a methodological study? What approaches to data extraction and statistical analysis should be considered when conducting a methodological study? What are potential threats to validity and is there a way to appraise the quality of methodological studies?

Appropriate reflection and application of basic principles of epidemiology and biostatistics are required in the design and analysis of methodological studies. This paper provides an introduction for further discussion about the conduct of methodological studies.


The field of meta-research (or research-on-research) has proliferated in recent years in response to issues with research quality and conduct [ 1 , 2 , 3 ]. As the name suggests, this field targets issues with research design, conduct, analysis and reporting. Various types of research reports are often examined as the unit of analysis in these studies (e.g. abstracts, full manuscripts, trial registry entries). Like many other novel fields of research, meta-research has seen a proliferation of use before the development of reporting guidance. For example, this was the case with randomized trials for which risk of bias tools and reporting guidelines were only developed much later – after many trials had been published and noted to have limitations [ 4 , 5 ]; and for systematic reviews as well [ 6 , 7 , 8 ]. However, in the absence of formal guidance, studies that report on research differ substantially in how they are named, conducted and reported [ 9 , 10 ]. This creates challenges in identifying, summarizing and comparing them. In this tutorial paper, we will use the term methodological study to refer to any study that reports on the design, conduct, analysis or reporting of primary or secondary research-related reports (such as trial registry entries and conference abstracts).

In the past 10 years, there has been an increase in the use of terms related to methodological studies (based on records retrieved with a keyword search [in the title and abstract] for “methodological review” and “meta-epidemiological study” in PubMed up to December 2019), suggesting that these studies may be appearing more frequently in the literature. See Fig. 1.

Figure 1. Trends in the number of studies that mention “methodological review” or “meta-epidemiological study” in PubMed.

The methods used in many methodological studies have been borrowed from systematic and scoping reviews. This practice has influenced the direction of the field, with many methodological studies including searches of electronic databases, screening of records, duplicate data extraction and assessments of risk of bias in the included studies. However, the research questions posed in methodological studies do not always require the approaches listed above, and guidance is needed on when and how to apply these methods to a methodological study. Even though methodological studies can be conducted on qualitative or mixed methods research, this paper focuses on and draws examples exclusively from quantitative research.

The objectives of this paper are to provide some insights on how to conduct methodological studies so that there is greater consistency between the research questions posed, and the design, analysis and reporting of findings. We provide multiple examples to illustrate concepts and a proposed framework for categorizing methodological studies in quantitative research.

What is a methodological study?

Any study that describes or analyzes methods (design, conduct, analysis or reporting) in published (or unpublished) literature is a methodological study. Consequently, the scope of methodological studies is quite extensive and includes, but is not limited to, topics as diverse as: research question formulation [ 11 ]; adherence to reporting guidelines [ 12 , 13 , 14 ] and consistency in reporting [ 15 ]; approaches to study analysis [ 16 ]; investigating the credibility of analyses [ 17 ]; and studies that synthesize these methodological studies [ 18 ]. While the nomenclature of methodological studies is not uniform, the intents and purposes of these studies remain fairly consistent – to describe or analyze methods in primary or secondary studies. As such, methodological studies may also be classified as a subtype of observational studies.

Parallel to this are experimental studies that compare different methods. Even though they play an important role in informing optimal research methods, experimental methodological studies are beyond the scope of this paper. Examples of such studies include the randomized trials by Buscemi et al., comparing single data extraction to double data extraction [ 19 ], and Carrasco-Labra et al., comparing approaches to presenting findings in Grading of Recommendations, Assessment, Development and Evaluations (GRADE) summary of findings tables [ 20 ]. In these studies, the unit of analysis is the person or groups of individuals applying the methods. We also direct readers to the Studies Within a Trial (SWAT) and Studies Within a Review (SWAR) programme operated through the Hub for Trials Methodology Research, for further reading as a potential useful resource for these types of experimental studies [ 21 ]. Lastly, this paper is not meant to inform the conduct of research using computational simulation and mathematical modeling for which some guidance already exists [ 22 ], or studies on the development of methods using consensus-based approaches.

When should we conduct a methodological study?

Methodological studies occupy a unique niche in health research that allows them to inform methodological advances. Methodological studies should also be conducted as precursors to reporting guideline development, as they provide an opportunity to understand current practices, and help to identify the need for guidance and gaps in methodological or reporting quality. For example, the development of the popular Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines was preceded by methodological studies identifying poor reporting practices [23, 24]. In these instances, after the reporting guidelines are published, methodological studies can also be used to monitor uptake of the guidelines.

These studies can also be conducted to inform the state of the art for design, analysis and reporting practices across different types of health research fields, with the aim of improving research practices, and preventing or reducing research waste. For example, Samaan et al. conducted a scoping review of adherence to different reporting guidelines in health care literature [ 18 ]. Methodological studies can also be used to determine the factors associated with reporting practices. For example, Abbade et al. investigated journal characteristics associated with the use of the Participants, Intervention, Comparison, Outcome, Timeframe (PICOT) format in framing research questions in trials of venous ulcer disease [ 11 ].

How often are methodological studies conducted?

There is no clear answer to this question. Based on a search of PubMed, the use of related terms (“methodological review” and “meta-epidemiological study”) – and therefore, the number of methodological studies – is on the rise. However, many other terms are used to describe methodological studies. There are also many studies that explore design, conduct, analysis or reporting of research reports, but that do not use any specific terms to describe or label their study design in terms of “methodology”. This diversity in nomenclature makes a census of methodological studies elusive. Appropriate terminology and key words for methodological studies are needed to facilitate improved accessibility for end-users.

Why do we conduct methodological studies?

Methodological studies provide information on the design, conduct, analysis or reporting of primary and secondary research and can be used to appraise quality, quantity, completeness, accuracy and consistency of health research. These issues can be explored in specific fields, journals, databases, geographical regions and time periods. For example, Areia et al. explored the quality of reporting of endoscopic diagnostic studies in gastroenterology [25]; Knol et al. investigated the reporting of p-values in baseline tables in randomized trials published in high impact journals [26]; Chen et al. describe adherence to the Consolidated Standards of Reporting Trials (CONSORT) statement in Chinese Journals [27]; and Hopewell et al. describe the effect of editors’ implementation of CONSORT guidelines on reporting of abstracts over time [28]. Methodological studies provide useful information to researchers, clinicians, editors, publishers and users of health literature. As a result, these studies have been a cornerstone of important methodological developments in the past two decades and have informed the development of many health research guidelines including the highly cited CONSORT statement [5].

Where can we find methodological studies?

Methodological studies can be found in most common biomedical bibliographic databases (e.g. Embase, MEDLINE, PubMed, Web of Science). However, the biggest caveat is that methodological studies are hard to identify in the literature due to the wide variety of names used and the lack of comprehensive databases dedicated to them. A handful can be found in the Cochrane Library as “Cochrane Methodology Reviews”, but these studies only cover methodological issues related to systematic reviews. Previous attempts to catalogue all empirical studies of methods used in reviews were abandoned 10 years ago [ 29 ]. In other databases, a variety of search terms may be applied with different levels of sensitivity and specificity.

Some frequently asked questions about methodological studies

In this section, we have outlined responses to questions that might help inform the conduct of methodological studies.

Q: How should I select research reports for my methodological study?

A: Selection of research reports for a methodological study depends on the research question and eligibility criteria. Once a clear research question is set and the nature of literature one desires to review is known, one can then begin the selection process. Selection may begin with a broad search, especially if the eligibility criteria are not apparent. For example, a methodological study of Cochrane Reviews of HIV would not require a complex search as all eligible studies can easily be retrieved from the Cochrane Library after checking a few boxes [ 30 ]. On the other hand, a methodological study of subgroup analyses in trials of gastrointestinal oncology would require a search to find such trials, and further screening to identify trials that conducted a subgroup analysis [ 31 ].

The strategies used for identifying participants in observational studies can apply here. One may use a systematic search to identify all eligible studies. If the number of eligible studies is unmanageable, a random sample of articles can be expected to provide comparable results if it is sufficiently large [ 32 ]. For example, Wilson et al. used a random sample of trials from the Cochrane Stroke Group’s Trial Register to investigate completeness of reporting [ 33 ]. It is possible that a simple random sample would lead to underrepresentation of units (i.e. research reports) that are smaller in number. This is relevant if the investigators wish to compare multiple groups but have too few units in one group. In this case a stratified sample would help to create equal groups. For example, in a methodological study comparing Cochrane and non-Cochrane reviews, Kahale et al. drew random samples from both groups [ 34 ]. Alternatively, systematic or purposeful sampling strategies can be used and we encourage researchers to justify their selected approaches based on the study objective.
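
The difference between a simple random sample and a stratified sample of research reports can be sketched as follows. This is a minimal illustration under stated assumptions: the sampling frame of 200 reports is invented, the function names and the fixed seed are hypothetical choices, and the stratum sizes would in practice be justified by the study objective.

```python
import random


def simple_random_sample(reports, k, seed=2020):
    """Draw k reports at random; a fixed seed keeps the draw reproducible."""
    rng = random.Random(seed)
    return rng.sample(reports, k)


def stratified_sample(reports, stratum_of, k_per_stratum, seed=2020):
    """Draw up to k_per_stratum reports from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for report in reports:
        strata.setdefault(stratum_of(report), []).append(report)
    return {
        name: rng.sample(members, min(k_per_stratum, len(members)))
        for name, members in strata.items()
    }


# Invented sampling frame: 200 reviews, of which every tenth is a Cochrane review
reports = [
    {"id": i, "type": "Cochrane" if i % 10 == 0 else "non-Cochrane"}
    for i in range(1, 201)
]

srs = simple_random_sample(reports, 30)
by_type = stratified_sample(reports, lambda r: r["type"], k_per_stratum=10)

print(sum(r["type"] == "Cochrane" for r in srs), "Cochrane reviews in the simple random sample")
print({name: len(members) for name, members in by_type.items()})
```

With only 20 Cochrane reviews in a frame of 200, a simple random sample of 30 may contain very few of them, whereas the stratified draw guarantees 10 in each group.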

Q: How many databases should I search?

A: The number of databases one should search would depend on the approach to sampling, which can include targeting the entire “population” of interest or a sample of that population. If you are interested in including the entire target population for your research question, or drawing a random or systematic sample from it, then a comprehensive and exhaustive search for relevant articles is required. In this case, we recommend using systematic approaches for searching electronic databases (i.e. at least 2 databases with a replicable and time stamped search strategy). The results of your search will constitute a sampling frame from which eligible studies can be drawn.

Alternatively, if your approach to sampling is purposeful, then we recommend targeting the database(s) or data sources (e.g. journals, registries) that include the information you need. For example, if you are conducting a methodological study of high impact journals in plastic surgery and they are all indexed in PubMed, you likely do not need to search any other databases. You may also have a comprehensive list of all journals of interest and can approach your search using the journal names in your database search (or by accessing the journal archives directly from the journal’s website). Even though one could also search journals’ web pages directly, using a database such as PubMed has multiple advantages, such as the use of filters, so the search can be narrowed down to a certain period, or study types of interest. Furthermore, individual journals’ web sites may have different search functionalities, which do not necessarily yield a consistent output.

Q: Should I publish a protocol for my methodological study?

A: A protocol is a description of intended research methods. Currently, only protocols for clinical trials require registration [ 35 ]. Protocols for systematic reviews are encouraged but no formal recommendation exists. The scientific community welcomes the publication of protocols because they help protect against selective outcome reporting and the use of post hoc methodologies to embellish results, and help avoid duplication of efforts [ 36 ]. While the latter two risks exist in methodological research, the negative consequences may be substantially less than for clinical outcomes. In a sample of 31 methodological studies, 7 (22.6%) referenced a published protocol [ 9 ]. In the Cochrane Library, there are 15 protocols for methodological reviews (as of 21 July 2020). This suggests that publishing protocols for methodological studies is not uncommon.

Authors can consider publishing their study protocol in a scholarly journal as a manuscript. Advantages of such publication include obtaining peer-review feedback about the planned study, and easy retrieval by searching databases such as PubMed. The disadvantages of trying to publish protocols include delays associated with manuscript handling and peer review, as well as costs, as few journals publish study protocols, and those journals mostly charge article-processing fees [ 37 ]. Authors who would like to make their protocol publicly available without publishing it in a scholarly journal could deposit it in a publicly available repository, such as the Open Science Framework ( https://osf.io/ ).

Q: How to appraise the quality of a methodological study?

A: To date, there is no published tool for appraising the risk of bias in a methodological study, but in principle, a methodological study could be considered as a type of observational study. Therefore, during conduct or appraisal, care should be taken to avoid the biases common in observational studies [ 38 ]. These biases include selection bias, comparability of groups, and ascertainment of exposure or outcome. In other words, to generate a representative sample, a comprehensive reproducible search may be necessary to build a sampling frame. Additionally, random sampling may be necessary to ensure that all the included research reports have the same probability of being selected, and the screening and selection processes should be transparent and reproducible. To ensure that the groups compared are similar in all characteristics, matching, random sampling or stratified sampling can be used. Statistical adjustments for between-group differences can also be applied at the analysis stage. Finally, duplicate data extraction can reduce errors in assessment of exposures or outcomes.

Q: Should I justify a sample size?

A: In all instances where one is not using the target population (i.e. the group to which inferences from the research report are directed) [ 39 ], a sample size justification is good practice. The sample size justification may take the form of a description of what is expected to be achieved with the number of articles selected, or a formal sample size estimation that outlines the number of articles required to answer the research question with a certain precision and power. Sample size justifications in methodological studies are reasonable in the following instances:

Comparing two groups

Determining a proportion, mean or another quantifier

Determining factors associated with an outcome using regression-based analyses

For example, El Dib et al. computed a sample size requirement for a methodological study of diagnostic strategies in randomized trials, based on a confidence interval approach [ 40 ].
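As a rough sketch of a confidence interval approach to sample size, the function below uses the standard normal-approximation formula for estimating a proportion, n = z²p(1 − p)/d², with an optional finite population correction. It is not necessarily the calculation used by El Dib et al.; the expected proportion, margin of error and population size shown are hypothetical inputs.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(expected_p, margin, confidence=0.95, population=None):
    """Number of articles needed to estimate a proportion within +/- margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * expected_p * (1 - expected_p) / margin ** 2
    if population is not None:
        # Finite population correction: fewer articles are needed when the
        # sampling frame itself is small.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


# Hypothetical inputs: no prior estimate (p = 0.5), margin of 5 percentage points.
n_unlimited = sample_size_for_proportion(0.5, 0.05)
n_frame_1000 = sample_size_for_proportion(0.5, 0.05, population=1000)
print(n_unlimited, n_frame_1000)
```

Setting the expected proportion to 0.5 is the conservative choice because it maximizes p(1 − p) and therefore the required sample size.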

Q: What should I call my study?

A: Other terms which have been used to describe/label methodological studies include “methodological review”, “methodological survey”, “meta-epidemiological study”, “systematic review”, “systematic survey”, “meta-research”, “research-on-research” and many others. We recommend that the study nomenclature be clear, unambiguous, informative and allow for appropriate indexing. Methodological study nomenclature that should be avoided includes “systematic review”, as this will likely be confused with a systematic review of a clinical question. “Systematic survey” may also lead to confusion about whether the survey was systematic (i.e. using a preplanned methodology) or a survey using “systematic” sampling (i.e. a sampling approach using specific intervals to determine who is selected) [ 32 ]. Any of the above meanings of the word “systematic” may be true for methodological studies and could be potentially misleading. “Meta-epidemiological study” is ideal for indexing, but not very informative as it describes an entire field. The term “review” may point towards an appraisal or “review” of the design, conduct, analysis or reporting (or methodological components) of the targeted research reports, yet it has also been used to describe narrative reviews [ 41 , 42 ]. The term “survey” is also in line with the approaches used in many methodological studies [ 9 ], and would be indicative of the sampling procedures of this study design. However, in the absence of guidelines on nomenclature, the term “methodological study” is broad enough to capture most of the scenarios of such studies.

Q: Should I account for clustering in my methodological study?

A: Data from methodological studies are often clustered. For example, articles coming from a specific source may have different reporting standards (e.g. the Cochrane Library). Articles within the same journal may be similar due to editorial practices and policies, reporting requirements and endorsement of guidelines. There is emerging evidence that these are real concerns that should be accounted for in analyses [ 43 ]. Some cluster variables are described in the section “What variables are relevant to methodological studies?”

A variety of modelling approaches can be used to account for correlated data, including the use of marginal, fixed or mixed effects regression models with appropriate computation of standard errors [ 44 ]. For example, Kosa et al. used generalized estimating equations to account for correlation of articles within journals [ 15 ]. Not accounting for clustering could lead to incorrect p-values, unduly narrow confidence intervals, and biased estimates [ 45 ].
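To illustrate why ignoring clustering produces unduly narrow confidence intervals, the sketch below applies the design effect, 1 + (m − 1) × ICC, to a simple proportion. This is a deliberately simplified approximation that assumes equal cluster sizes; it is not a substitute for the marginal or mixed effects models named above. The number of articles, cluster size, intracluster correlation (ICC) and observed proportion are all hypothetical.

```python
import math


def design_effect(mean_cluster_size, icc):
    """Variance inflation due to clustering, assuming equal cluster sizes."""
    return 1 + (mean_cluster_size - 1) * icc


def proportion_ci(p_hat, n, deff=1.0, z=1.96):
    """Wald confidence interval for a proportion, widened by sqrt(deff)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n) * math.sqrt(deff)
    return (max(0.0, p_hat - z * se), min(1.0, p_hat + z * se))


# Hypothetical study: 200 articles, 10 per journal, ICC of 0.10,
# and 40% of articles adequately reporting the item of interest.
n_articles = 200
articles_per_journal = 10
icc = 0.10
p_hat = 0.40

deff = design_effect(articles_per_journal, icc)
naive_ci = proportion_ci(p_hat, n_articles)
adjusted_ci = proportion_ci(p_hat, n_articles, deff)

print(f"design effect: {deff:.2f}")
print(f"effective sample size: {n_articles / deff:.0f}")
print(f"naive 95% CI:    {naive_ci[0]:.3f} to {naive_ci[1]:.3f}")
print(f"adjusted 95% CI: {adjusted_ci[0]:.3f} to {adjusted_ci[1]:.3f}")
```

Even a modest ICC reduces the effective sample size noticeably when many articles come from the same journal, which is why the naive interval is too narrow.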

Q: Should I extract data in duplicate?

A: Yes. Duplicate data extraction takes more time but results in fewer errors [ 19 ]. Data extraction errors in turn affect the effect estimate [ 46 ], and therefore should be mitigated. Duplicate data extraction should be considered in the absence of other approaches to minimize extraction errors. Much like systematic reviews, this area will likely see rapid new advances with machine learning and natural language processing technologies to support researchers with screening and data extraction [ 47 , 48 ]. However, experience plays an important role in the quality of extracted data and inexperienced extractors should be paired with experienced extractors [ 46 , 49 ].
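In practice, duplicate extraction ends with a comparison of the two extractors' records so that disagreements can be resolved. The following is a minimal sketch of that comparison; the report identifiers, field names and values are hypothetical, and the function `find_discrepancies` is illustrative rather than part of any published tool.

```python
def find_discrepancies(extraction_a, extraction_b):
    """Return (report_id, field, value_a, value_b) for every disagreement."""
    discrepancies = []
    for report_id in sorted(extraction_a.keys() | extraction_b.keys()):
        row_a = extraction_a.get(report_id, {})
        row_b = extraction_b.get(report_id, {})
        for field in sorted(row_a.keys() | row_b.keys()):
            value_a, value_b = row_a.get(field), row_b.get(field)
            if value_a != value_b:
                discrepancies.append((report_id, field, value_a, value_b))
    return discrepancies


# Hypothetical records from two independent extractors.
extractor_a = {
    "trial-01": {"blinded": True, "sample_size": 120},
    "trial-02": {"blinded": False, "sample_size": 85},
}
extractor_b = {
    "trial-01": {"blinded": True, "sample_size": 102},
    "trial-02": {"blinded": False, "sample_size": 85},
}

discrepancies = find_discrepancies(extractor_a, extractor_b)
total_items = sum(len(fields) for fields in extractor_a.values())
agreement = 1 - len(discrepancies) / total_items

for report_id, field, value_a, value_b in discrepancies:
    print(f"{report_id}: {field} differs ({value_a!r} vs {value_b!r})")
print(f"raw agreement: {agreement:.0%}")
```

Each listed discrepancy would then be resolved by discussion or by a third reviewer before analysis.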

Q: Should I assess the risk of bias of research reports included in my methodological study?

A: Risk of bias is most useful in determining the certainty that can be placed in the effect measure from a study. In methodological studies, risk of bias may not serve the purpose of determining the trustworthiness of results, as effect measures are often not the primary goal of methodological studies. Determining risk of bias in methodological studies is likely a practice borrowed from systematic review methodology, but whose intrinsic value is not obvious in methodological studies. When it is part of the research question, investigators often focus on one aspect of risk of bias. For example, Speich investigated how blinding was reported in surgical trials [ 50 ], and Abraha et al. investigated the application of intention-to-treat analyses in systematic reviews and trials [ 51 ].

Q: What variables are relevant to methodological studies?

A: There is empirical evidence that certain variables may inform the findings in a methodological study. We outline some of these and provide a brief overview below:

Country: Countries and regions differ in their research cultures, and the resources available to conduct research. Therefore, it is reasonable to believe that there may be differences in methodological features across countries. Methodological studies have reported loco-regional differences in reporting quality [ 52 , 53 ]. This may also be related to challenges non-English speakers face in publishing papers in English.

Authors’ expertise: The inclusion of authors with expertise in research methodology, biostatistics, and scientific writing is likely to influence the end-product. Oltean et al. found that among randomized trials in orthopaedic surgery, the use of analyses that accounted for clustering was more likely when specialists (e.g. statistician, epidemiologist or clinical trials methodologist) were included on the study team [ 54 ]. Fleming et al. found that including methodologists in the review team was associated with appropriate use of reporting guidelines [ 55 ].

Source of funding and conflicts of interest: Some studies have found that funded studies report better [ 56 , 57 ], while others do not [ 53 , 58 ]. The presence of funding would indicate the availability of resources deployed to ensure optimal design, conduct, analysis and reporting. However, the source of funding may introduce conflicts of interest and warrant assessment. For example, Kaiser et al. investigated the effect of industry funding on obesity or nutrition randomized trials and found that reporting quality was similar [ 59 ]. Thomas et al. looked at reporting quality of long-term weight loss trials and found that industry funded studies were better [ 60 ]. Kan et al. examined the association between industry funding and “positive trials” (trials reporting a significant intervention effect) and found that industry funding was highly predictive of a positive trial [ 61 ]. This finding is similar to that of a recent Cochrane Methodology Review by Hansen et al. [ 62 ].

Journal characteristics: Certain journals’ characteristics may influence the study design, analysis or reporting. Characteristics such as journal endorsement of guidelines [ 63 , 64 ], and Journal Impact Factor (JIF) have been shown to be associated with reporting [ 63 , 65 , 66 , 67 ].

Study size (sample size/number of sites): Some studies have shown that reporting is better in larger studies [ 53 , 56 , 58 ].

Year of publication: It is reasonable to assume that design, conduct, analysis and reporting of research will change over time. Many studies have demonstrated improvements in reporting over time or after the publication of reporting guidelines [ 68 , 69 ].

Type of intervention: In a methodological study of reporting quality of weight loss intervention studies, Thabane et al. found that trials of pharmacologic interventions were reported better than trials of non-pharmacologic interventions [ 70 ].

Interactions between variables: Complex interactions between the previously listed variables are possible. High income countries with more resources may be more likely to conduct larger studies and incorporate a variety of experts. Authors in certain countries may prefer certain journals, and journal endorsement of guidelines and editorial policies may change over time.

Q: Should I focus only on high impact journals?

A: Investigators may choose to investigate only high impact journals because they are more likely to influence practice and policy, or because they assume that methodological standards would be higher. However, the JIF may severely limit the scope of articles included and may skew the sample towards articles with positive findings. The generalizability and applicability of findings from a handful of journals must be examined carefully, especially since the JIF varies over time. Even among journals that are all “high impact”, variations exist in methodological standards.

Q: Can I conduct a methodological study of qualitative research?

A: Yes. Even though a lot of methodological research has been conducted in the quantitative research field, methodological studies of qualitative studies are feasible. Certain databases that catalogue qualitative research including the Cumulative Index to Nursing & Allied Health Literature (CINAHL) have defined subject headings that are specific to methodological research (e.g. “research methodology”). Alternatively, one could also conduct a qualitative methodological review; that is, use qualitative approaches to synthesize methodological issues in qualitative studies.

Q: What reporting guidelines should I use for my methodological study?

A: There is no guideline that covers the entire scope of methodological studies. One adaptation of the PRISMA guidelines has been published, which works well for studies that aim to use the entire target population of research reports [ 71 ]. However, it is not widely used (40 citations in 2 years as of 09 December 2019), and methodological studies that are designed as cross-sectional or before-after studies require a more fit-for-purpose guideline. A more encompassing reporting guideline for a broad range of methodological studies is currently under development [ 72 ]. However, in the absence of formal guidance, the requirements for scientific reporting should be respected, and authors of methodological studies should focus on transparency and reproducibility.

Q: What are the potential threats to validity and how can I avoid them?

A: Methodological studies may be compromised by a lack of internal or external validity. The main threats to internal validity in methodological studies are selection and confounding bias. Investigators must ensure that the methods used to select articles do not make them differ systematically from the set of articles to which they would like to make inferences. For example, attempting to make extrapolations to all journals after analyzing high-impact journals would be misleading.

Many factors (confounders) may distort the association between the exposure and outcome if the included research reports differ with respect to these factors [ 73 ]. For example, when examining the association between source of funding and completeness of reporting, it may be necessary to account for journals that endorse the guidelines. Confounding bias can be addressed by restriction, matching and statistical adjustment [ 73 ]. Restriction appears to be the method of choice for many investigators who choose to include only high impact journals or articles in a specific field. For example, Knol et al. examined the reporting of p-values in baseline tables of high impact journals [ 26 ]. Matching is also sometimes used. In the methodological study of non-randomized interventional studies of elective ventral hernia repair, Parker et al. matched prospective studies with retrospective studies and compared reporting standards [ 74 ]. Some other methodological studies use statistical adjustments. For example, Zhang et al. used regression techniques to determine the factors associated with missing participant data in trials [ 16 ].
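The funding example can be made concrete with a stratified analysis. The sketch below compares a crude odds ratio with a Mantel-Haenszel odds ratio stratified by journal endorsement of guidelines. Mantel-Haenszel stratification is used here only as one simple form of statistical adjustment; it is not the regression approach used by Zhang et al. All counts are invented to show how a confounder can create an apparent association.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table: a, b = exposed with/without the outcome;
    c, d = unexposed with/without the outcome."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts. Exposure: funded study. Outcome: complete reporting.
# Each tuple is (funded complete, funded incomplete,
#                unfunded complete, unfunded incomplete).
strata = [
    (80, 20, 16, 4),   # journals that endorse the reporting guideline
    (5, 15, 25, 75),   # journals that do not endorse it
]

# Collapse the strata to get the crude (unadjusted) table.
crude_table = [sum(column) for column in zip(*strata)]
crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata)

print(f"crude OR:    {crude_or:.2f}")
print(f"adjusted OR: {adjusted_or:.2f}")
```

In these invented data, funded studies appear far more likely to report completely only because they are concentrated in endorsing journals; within each stratum funding makes no difference.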

With regard to external validity, researchers interested in conducting methodological studies must consider how generalizable or applicable their findings are. This should tie in closely with the research question and should be explicit. For example, findings from methodological studies on trials published in high impact cardiology journals cannot be assumed to be applicable to trials in other fields. However, investigators must ensure that their sample truly represents the target sample either by a) conducting a comprehensive and exhaustive search, or b) using an appropriate and justified, randomly selected sample of research reports.

Even applicability to high impact journals may vary based on the investigators’ definition, and over time. For example, for high impact journals in the field of general medicine, Bouwmeester et al. included the Annals of Internal Medicine (AIM), BMJ, the Journal of the American Medical Association (JAMA), Lancet, the New England Journal of Medicine (NEJM), and PLoS Medicine (n = 6) [ 75 ]. In contrast, the high impact journals selected in the methodological study by Schiller et al. were BMJ, JAMA, Lancet, and NEJM (n = 4) [ 76 ]. Another methodological study by Kosa et al. included AIM, BMJ, JAMA, Lancet and NEJM (n = 5). In the methodological study by Thabut et al., journals with a JIF greater than 5 were considered to be high impact. Riado Minguez et al. used first quartile journals in the Journal Citation Reports (JCR) for a specific year to determine “high impact” [ 77 ]. Ultimately, the definition of high impact will be based on the number of journals the investigators are willing to include, the year of impact and the JIF cut-off [ 78 ]. We acknowledge that the term “generalizability” may apply differently for methodological studies, especially when in many instances it is possible to include the entire target population in the sample studied.

Finally, methodological studies are not exempt from information bias which may stem from discrepancies in the included research reports [ 79 ], errors in data extraction, or inappropriate interpretation of the information extracted. Likewise, publication bias may also be a concern in methodological studies, but such concepts have not yet been explored.

A proposed framework

In order to inform discussions about methodological studies and the development of guidance for what should be reported, we have outlined some key features of methodological studies that can be used to classify them. For each of the categories outlined below, we provide an example. In our experience, the choice of approach to completing a methodological study can be informed by asking the following four questions:

What is the aim?

Methodological studies that investigate bias

A methodological study may be focused on exploring sources of bias in primary or secondary studies (meta-bias), or how bias is analyzed. We have taken care to distinguish bias (i.e. systematic deviations from the truth irrespective of the source) from reporting quality or completeness (i.e. not adhering to a specific reporting guideline or norm). An example of where this distinction would be important is in the case of a randomized trial with no blinding. This study (depending on the nature of the intervention) would be at risk of performance bias. However, if the authors report that their study was not blinded, they would have reported adequately. In fact, some methodological studies attempt to capture both “quality of conduct” and “quality of reporting”, such as Richie et al., who reported on the risk of bias in randomized trials of pharmacy practice interventions [ 80 ]. Babic et al. investigated how risk of bias was used to inform sensitivity analyses in Cochrane reviews [ 81 ]. Further, biases related to choice of outcomes can also be explored. For example, Tan et al. investigated differences in treatment effect size based on the outcome reported [ 82 ].

Methodological studies that investigate quality (or completeness) of reporting

Methodological studies may report quality of reporting against a reporting checklist (i.e. adherence to guidelines) or against expected norms. For example, Croituro et al. report on the quality of reporting in systematic reviews published in dermatology journals based on their adherence to the PRISMA statement [ 83 ], and Khan et al. described the quality of reporting of harms in randomized controlled trials published in high impact cardiovascular journals based on the CONSORT extension for harms [ 84 ]. Other methodological studies investigate reporting of certain features of interest that may not be part of formally published checklists or guidelines. For example, Mbuagbaw et al. described how often the implications for research are elaborated using the Evidence, Participants, Intervention, Comparison, Outcome, Timeframe (EPICOT) format [ 30 ].

Methodological studies that investigate the consistency of reporting

Sometimes investigators may be interested in how consistent reports of the same research are, as it is expected that there should be consistency between: conference abstracts and published manuscripts; manuscript abstracts and manuscript main text; and trial registration and published manuscript. For example, Rosmarakis et al. investigated consistency between conference abstracts and full text manuscripts [ 85 ].

Methodological studies that investigate factors associated with reporting

In addition to identifying issues with reporting in primary and secondary studies, authors of methodological studies may be interested in determining the factors that are associated with certain reporting practices. Many methodological studies incorporate this, albeit as a secondary outcome. For example, Farrokhyar et al. investigated the factors associated with reporting quality in randomized trials of coronary artery bypass grafting surgery [ 53 ].

Methodological studies that investigate methods

Methodological studies may also be used to describe methods or compare methods, and the factors associated with methods. Muller et al. described the methods used for systematic reviews and meta-analyses of observational studies [ 86 ].

Methodological studies that summarize other methodological studies

Some methodological studies synthesize results from other methodological studies. For example, Li et al. conducted a scoping review of methodological reviews that investigated consistency between full text and abstracts in primary biomedical research [ 87 ].

Methodological studies that investigate nomenclature and terminology

Some methodological studies may investigate the use of names and terms in health research. For example, Martinic et al. investigated the definitions of systematic reviews used in overviews of systematic reviews (OSRs), meta-epidemiological studies and epidemiology textbooks [ 88 ].

Other types of methodological studies

In addition to the previously mentioned experimental methodological studies, there may exist other types of methodological studies not captured here.

What is the design?

Methodological studies that are descriptive

Most methodological studies are purely descriptive and report their findings as counts (percent) and means (standard deviation) or medians (interquartile range). For example, Mbuagbaw et al. described the reporting of research recommendations in Cochrane HIV systematic reviews [ 30 ]. Gohari et al. described the quality of reporting of randomized trials in diabetes in Iran [ 12 ].
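The summary measures named above can be produced with Python's standard library alone. The sketch below is illustrative: the checklist scores and the binary item (whether a report cited a protocol) are hypothetical, and the helper names are not taken from any of the cited studies.

```python
import statistics


def summarize_binary(values):
    """Return (count, percent) of True values, for reporting as n (%)."""
    count = sum(1 for v in values if v)
    return count, 100 * count / len(values)


def summarize_continuous(values):
    """Return the mean, SD, median and quartiles of a numeric variable."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # three quartile cut points
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": statistics.median(values),
        "iqr": (q1, q3),
    }


# Hypothetical data extracted from eight research reports.
checklist_scores = [12, 15, 9, 20, 17, 14, 11, 18]
cited_protocol = [True, False, False, True, False, False, True, False]

count, percent = summarize_binary(cited_protocol)
summary = summarize_continuous(checklist_scores)

print(f"Cited a protocol: {count} ({percent:.1f}%)")
print(f"Checklist score, mean (SD): {summary['mean']:.1f} ({summary['sd']:.1f})")
print(f"Checklist score, median (IQR): {summary['median']} "
      f"({summary['iqr'][0]} to {summary['iqr'][1]})")
```

Note that statistical packages differ in how they compute quartiles, so the method used should be stated when reporting an interquartile range.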

Methodological studies that are analytical

Some methodological studies are analytical wherein “analytical studies identify and quantify associations, test hypotheses, identify causes and determine whether an association exists between variables, such as between an exposure and a disease.” [ 89 ] In the case of methodological studies all these investigations are possible. For example, Kosa et al. investigated the association between agreement in primary outcome from trial registry to published manuscript and study covariates. They found that larger and more recent studies were more likely to have agreement [ 15 ]. Tricco et al. compared the conclusion statements from Cochrane and non-Cochrane systematic reviews with a meta-analysis of the primary outcome and found that non-Cochrane reviews were more likely to report positive findings. These results are a test of the null hypothesis that the proportions of Cochrane and non-Cochrane reviews that report positive results are equal [ 90 ].
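A test of the null hypothesis that two proportions are equal can be sketched as a two-sample z-test. This is one common way to run such a comparison and is not necessarily the analysis used by Tricco et al.; the counts below are hypothetical and do not come from their study.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(events_1, n_1, events_2, n_2):
    """Two-sided z-test of H0: the two underlying proportions are equal."""
    p_1, p_2 = events_1 / n_1, events_2 / n_2
    # Under H0 both groups share one proportion, so pool the events.
    pooled = (events_1 + events_2) / (n_1 + n_2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts of reviews with positive conclusions:
# 60 of 100 non-Cochrane reviews versus 45 of 100 Cochrane reviews.
z, p_value = two_proportion_z_test(60, 100, 45, 100)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

The normal approximation is reasonable when each group has a fair number of events and non-events; with small counts an exact test would be a safer choice.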

What is the sampling strategy?

Methodological studies that include the target population

Methodological reviews with narrow research questions may be able to include the entire target population. For example, in the methodological study of Cochrane HIV systematic reviews, Mbuagbaw et al. included all of the available studies (n = 103) [ 30 ].

Methodological studies that include a sample of the target population

Many methodological studies use random samples of the target population [ 33 , 91 , 92 ]. Alternatively, purposeful sampling may be used, limiting the sample to a subset of research-related reports published within a certain time period, or in journals with a certain ranking or on a topic. Systematic sampling can also be used when random sampling may be challenging to implement.
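Systematic sampling, as mentioned above, selects reports at a fixed interval from an ordered sampling frame. The sketch below assumes the frame is a list of report identifiers sorted in some reproducible order (for example, by publication date); the frame, sample size and function name are hypothetical.

```python
import random


def systematic_sample(frame, n, seed=2020):
    """Select n items at a fixed interval, starting from a random position."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the size of the frame")
    interval = len(frame) // n
    # A random start within the first interval gives every report
    # a chance of being selected.
    start = random.Random(seed).randrange(interval)
    return frame[start::interval][:n]


# Hypothetical sampling frame of 1000 ordered report identifiers.
frame_ids = list(range(1, 1001))
selected = systematic_sample(frame_ids, n=50)
print(selected[:5])
```

Because selection depends on the ordering of the frame, any periodic pattern in that ordering (for example, issues that always open with the same article type) could bias the sample and should be checked.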

What is the unit of analysis?

Methodological studies with a research report as the unit of analysis

Many methodological studies use a research report (e.g. full manuscript of study, abstract portion of the study) as the unit of analysis, and inferences can be made at the study-level. However, both published and unpublished research-related reports can be studied. These may include articles, conference abstracts, registry entries etc.

Methodological studies with a design, analysis or reporting item as the unit of analysis

Some methodological studies report on items which may occur more than once per article. For example, Paquette et al. report on subgroup analyses in Cochrane reviews of atrial fibrillation in which 17 systematic reviews planned 56 subgroup analyses [ 93 ].

This framework is outlined in Fig. 2.

Figure 2. A proposed framework for methodological studies


Methodological studies have examined different aspects of reporting such as quality, completeness, consistency and adherence to reporting guidelines. As such, many of the methodological study examples cited in this tutorial are related to reporting. However, as an evolving field, the scope of research questions that can be addressed by methodological studies is expected to increase.

In this paper we have outlined the scope and purpose of methodological studies, along with examples of instances in which various approaches have been used. In the absence of formal guidance on the design, conduct, analysis and reporting of methodological studies, we have provided some advice to help make methodological studies consistent. This advice is grounded in good contemporary scientific practice. Generally, the research question should tie in with the sampling approach and planned analysis. We have also highlighted the variables that may inform findings from methodological studies. Lastly, we have provided suggestions for ways in which authors can categorize their methodological studies to inform their design and analysis.

Availability of data and materials

Data sharing is not applicable to this article as no new data were created or analyzed in this study.


CONSORT: Consolidated Standards of Reporting Trials

EPICOT: Evidence, Participants, Intervention, Comparison, Outcome, Timeframe

GRADE: Grading of Recommendations, Assessment, Development and Evaluations

PICOT: Participants, Intervention, Comparison, Outcome, Timeframe

PRISMA: Preferred Reporting Items of Systematic reviews and Meta-Analyses

SWAR: Studies Within a Review

SWAT: Studies Within a Trial

Chalmers I, Glasziou P. Avoidable waste in the production and reporting of research evidence. Lancet. 2009;374(9683):86–9.

Chan AW, Song F, Vickers A, Jefferson T, Dickersin K, Gotzsche PC, Krumholz HM, Ghersi D, van der Worp HB. Increasing value and reducing waste: addressing inaccessible research. Lancet. 2014;383(9913):257–66.

Ioannidis JP, Greenland S, Hlatky MA, Khoury MJ, Macleod MR, Moher D, Schulz KF, Tibshirani R. Increasing value and reducing waste in research design, conduct, and analysis. Lancet. 2014;383(9912):166–75.

Higgins JP, Altman DG, Gotzsche PC, Juni P, Moher D, Oxman AD, Savovic J, Schulz KF, Weeks L, Sterne JA. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ. 2011;343:d5928.

Moher D, Schulz KF, Altman DG. The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomised trials. Lancet. 2001;357.

Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gotzsche PC, Ioannidis JP, Clarke M, Devereaux PJ, Kleijnen J, Moher D. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. PLoS Med. 2009;6(7):e1000100.

Shea BJ, Hamel C, Wells GA, Bouter LM, Kristjansson E, Grimshaw J, Henry DA, Boers M. AMSTAR is a reliable and valid measurement tool to assess the methodological quality of systematic reviews. J Clin Epidemiol. 2009;62(10):1013–20.

Shea BJ, Reeves BC, Wells G, Thuku M, Hamel C, Moran J, Moher D, Tugwell P, Welch V, Kristjansson E, et al. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ. 2017;358:j4008.

Lawson DO, Leenus A, Mbuagbaw L. Mapping the nomenclature, methodology, and reporting of studies that review methods: a pilot methodological review. Pilot Feasibility Studies. 2020;6(1):13.

Puljak L, Makaric ZL, Buljan I, Pieper D. What is a meta-epidemiological study? Analysis of published literature indicated heterogeneous study designs and definitions. J Comp Eff Res. 2020.

Abbade LPF, Wang M, Sriganesh K, Jin Y, Mbuagbaw L, Thabane L. The framing of research questions using the PICOT format in randomized controlled trials of venous ulcer disease is suboptimal: a systematic survey. Wound Repair Regen. 2017;25(5):892–900.

Gohari F, Baradaran HR, Tabatabaee M, Anijidani S, Mohammadpour Touserkani F, Atlasi R, Razmgir M. Quality of reporting randomized controlled trials (RCTs) in diabetes in Iran; a systematic review. J Diabetes Metab Disord. 2015;15(1):36.

Wang M, Jin Y, Hu ZJ, Thabane A, Dennis B, Gajic-Veljanoski O, Paul J, Thabane L. The reporting quality of abstracts of stepped wedge randomized trials is suboptimal: a systematic survey of the literature. Contemp Clin Trials Commun. 2017;8:1–10.

Shanthanna H, Kaushal A, Mbuagbaw L, Couban R, Busse J, Thabane L. A cross-sectional study of the reporting quality of pilot or feasibility trials in high-impact anesthesia journals. Can J Anaesthesia. 2018;65(11):1180–95.

Kosa SD, Mbuagbaw L, Borg Debono V, Bhandari M, Dennis BB, Ene G, Leenus A, Shi D, Thabane M, Valvasori S, et al. Agreement in reporting between trial publications and current clinical trial registry in high impact journals: a methodological review. Contemporary Clinical Trials. 2018;65:144–50.

Zhang Y, Florez ID, Colunga Lozano LE, Aloweni FAB, Kennedy SA, Li A, Craigie S, Zhang S, Agarwal A, Lopes LC, et al. A systematic survey on reporting and methods for handling missing participant data for continuous outcomes in randomized controlled trials. J Clin Epidemiol. 2017;88:57–66.

CAS   PubMed   Google Scholar  

Hernández AV, Boersma E, Murray GD, Habbema JD, Steyerberg EW. Subgroup analyses in therapeutic cardiovascular clinical trials: are most of them misleading? Am Heart J. 2006;151(2):257–64.

Samaan Z, Mbuagbaw L, Kosa D, Borg Debono V, Dillenburg R, Zhang S, Fruci V, Dennis B, Bawor M, Thabane L. A systematic scoping review of adherence to reporting guidelines in health care literature. J Multidiscip Healthc. 2013;6:169–88.

Buscemi N, Hartling L, Vandermeer B, Tjosvold L, Klassen TP. Single data extraction generated more errors than double data extraction in systematic reviews. J Clin Epidemiol. 2006;59(7):697–703.

Carrasco-Labra A, Brignardello-Petersen R, Santesso N, Neumann I, Mustafa RA, Mbuagbaw L, Etxeandia Ikobaltzeta I, De Stio C, McCullagh LJ, Alonso-Coello P. Improving GRADE evidence tables part 1: a randomized trial shows improved understanding of content in summary-of-findings tables with a new format. J Clin Epidemiol. 2016;74:7–18.

The Northern Ireland Hub for Trials Methodology Research: SWAT/SWAR Information [ https://www.qub.ac.uk/sites/TheNorthernIrelandNetworkforTrialsMethodologyResearch/SWATSWARInformation/ ]. Accessed 31 Aug 2020.

Chick S, Sánchez P, Ferrin D, Morrice D. How to conduct a successful simulation study. In: Proceedings of the 2003 winter simulation conference: 2003; 2003. p. 66–70.

Google Scholar  

Mulrow CD. The medical review article: state of the science. Ann Intern Med. 1987;106(3):485–8.

Sacks HS, Reitman D, Pagano D, Kupelnick B. Meta-analysis: an update. Mount Sinai J Med New York. 1996;63(3–4):216–24.

CAS   Google Scholar  

Areia M, Soares M, Dinis-Ribeiro M. Quality reporting of endoscopic diagnostic studies in gastrointestinal journals: where do we stand on the use of the STARD and CONSORT statements? Endoscopy. 2010;42(2):138–47.

Knol M, Groenwold R, Grobbee D. P-values in baseline tables of randomised controlled trials are inappropriate but still common in high impact journals. Eur J Prev Cardiol. 2012;19(2):231–2.

Chen M, Cui J, Zhang AL, Sze DM, Xue CC, May BH. Adherence to CONSORT items in randomized controlled trials of integrative medicine for colorectal Cancer published in Chinese journals. J Altern Complement Med. 2018;24(2):115–24.

Hopewell S, Ravaud P, Baron G, Boutron I. Effect of editors' implementation of CONSORT guidelines on the reporting of abstracts in high impact medical journals: interrupted time series analysis. BMJ. 2012;344:e4178.

The Cochrane Methodology Register Issue 2 2009 [ https://cmr.cochrane.org/help.htm ]. Accessed 31 Aug 2020.

Mbuagbaw L, Kredo T, Welch V, Mursleen S, Ross S, Zani B, Motaze NV, Quinlan L. Critical EPICOT items were absent in Cochrane human immunodeficiency virus systematic reviews: a bibliometric analysis. J Clin Epidemiol. 2016;74:66–72.

Barton S, Peckitt C, Sclafani F, Cunningham D, Chau I. The influence of industry sponsorship on the reporting of subgroup analyses within phase III randomised controlled trials in gastrointestinal oncology. Eur J Cancer. 2015;51(18):2732–9.

Setia MS. Methodology series module 5: sampling strategies. Indian J Dermatol. 2016;61(5):505–9.

Wilson B, Burnett P, Moher D, Altman DG, Al-Shahi Salman R. Completeness of reporting of randomised controlled trials including people with transient ischaemic attack or stroke: a systematic review. Eur Stroke J. 2018;3(4):337–46.

Kahale LA, Diab B, Brignardello-Petersen R, Agarwal A, Mustafa RA, Kwong J, Neumann I, Li L, Lopes LC, Briel M, et al. Systematic reviews do not adequately report or address missing outcome data in their analyses: a methodological survey. J Clin Epidemiol. 2018;99:14–23.

De Angelis CD, Drazen JM, Frizelle FA, Haug C, Hoey J, Horton R, Kotzin S, Laine C, Marusic A, Overbeke AJPM, et al. Is this clinical trial fully registered?: a statement from the International Committee of Medical Journal Editors*. Ann Intern Med. 2005;143(2):146–8.

Ohtake PJ, Childs JD. Why publish study protocols? Phys Ther. 2014;94(9):1208–9.

Rombey T, Allers K, Mathes T, Hoffmann F, Pieper D. A descriptive analysis of the characteristics and the peer review process of systematic review protocols published in an open peer review journal from 2012 to 2017. BMC Med Res Methodol. 2019;19(1):57.

Grimes DA, Schulz KF. Bias and causal associations in observational research. Lancet. 2002;359(9302):248–52.

Porta M (ed.): A dictionary of epidemiology, 5th edn. Oxford: Oxford University Press, Inc.; 2008.

El Dib R, Tikkinen KAO, Akl EA, Gomaa HA, Mustafa RA, Agarwal A, Carpenter CR, Zhang Y, Jorge EC, Almeida R, et al. Systematic survey of randomized trials evaluating the impact of alternative diagnostic strategies on patient-important outcomes. J Clin Epidemiol. 2017;84:61–9.

Helzer JE, Robins LN, Taibleson M, Woodruff RA Jr, Reich T, Wish ED. Reliability of psychiatric diagnosis. I. a methodological review. Arch Gen Psychiatry. 1977;34(2):129–33.

Chung ST, Chacko SK, Sunehag AL, Haymond MW. Measurements of gluconeogenesis and Glycogenolysis: a methodological review. Diabetes. 2015;64(12):3996–4010.

CAS   PubMed   PubMed Central   Google Scholar  

Sterne JA, Juni P, Schulz KF, Altman DG, Bartlett C, Egger M. Statistical methods for assessing the influence of study characteristics on treatment effects in 'meta-epidemiological' research. Stat Med. 2002;21(11):1513–24.

Moen EL, Fricano-Kugler CJ, Luikart BW, O’Malley AJ. Analyzing clustered data: why and how to account for multiple observations nested within a study participant? PLoS One. 2016;11(1):e0146721.

Zyzanski SJ, Flocke SA, Dickinson LM. On the nature and analysis of clustered data. Ann Fam Med. 2004;2(3):199–200.

Mathes T, Klassen P, Pieper D. Frequency of data extraction errors and methods to increase data extraction quality: a methodological review. BMC Med Res Methodol. 2017;17(1):152.

Bui DDA, Del Fiol G, Hurdle JF, Jonnalagadda S. Extractive text summarization system to aid data extraction from full text in systematic review development. J Biomed Inform. 2016;64:265–72.

Bui DD, Del Fiol G, Jonnalagadda S. PDF text classification to leverage information extraction from publication reports. J Biomed Inform. 2016;61:141–8.

Maticic K, Krnic Martinic M, Puljak L. Assessment of reporting quality of abstracts of systematic reviews with meta-analysis using PRISMA-A and discordance in assessments between raters without prior experience. BMC Med Res Methodol. 2019;19(1):32.

Speich B. Blinding in surgical randomized clinical trials in 2015. Ann Surg. 2017;266(1):21–2.

Abraha I, Cozzolino F, Orso M, Marchesi M, Germani A, Lombardo G, Eusebi P, De Florio R, Luchetta ML, Iorio A, et al. A systematic review found that deviations from intention-to-treat are common in randomized trials and systematic reviews. J Clin Epidemiol. 2017;84:37–46.

Zhong Y, Zhou W, Jiang H, Fan T, Diao X, Yang H, Min J, Wang G, Fu J, Mao B. Quality of reporting of two-group parallel randomized controlled clinical trials of multi-herb formulae: A survey of reports indexed in the Science Citation Index Expanded. Eur J Integrative Med. 2011;3(4):e309–16.

Farrokhyar F, Chu R, Whitlock R, Thabane L. A systematic review of the quality of publications reporting coronary artery bypass grafting trials. Can J Surg. 2007;50(4):266–77.

Oltean H, Gagnier JJ. Use of clustering analysis in randomized controlled trials in orthopaedic surgery. BMC Med Res Methodol. 2015;15:17.

Fleming PS, Koletsi D, Pandis N. Blinded by PRISMA: are systematic reviewers focusing on PRISMA and ignoring other guidelines? PLoS One. 2014;9(5):e96407.

Balasubramanian SP, Wiener M, Alshameeri Z, Tiruvoipati R, Elbourne D, Reed MW. Standards of reporting of randomized controlled trials in general surgery: can we do better? Ann Surg. 2006;244(5):663–7.

de Vries TW, van Roon EN. Low quality of reporting adverse drug reactions in paediatric randomised controlled trials. Arch Dis Child. 2010;95(12):1023–6.

Borg Debono V, Zhang S, Ye C, Paul J, Arya A, Hurlburt L, Murthy Y, Thabane L. The quality of reporting of RCTs used within a postoperative pain management meta-analysis, using the CONSORT statement. BMC Anesthesiol. 2012;12:13.

Kaiser KA, Cofield SS, Fontaine KR, Glasser SP, Thabane L, Chu R, Ambrale S, Dwary AD, Kumar A, Nayyar G, et al. Is funding source related to study reporting quality in obesity or nutrition randomized control trials in top-tier medical journals? Int J Obes. 2012;36(7):977–81.

Thomas O, Thabane L, Douketis J, Chu R, Westfall AO, Allison DB. Industry funding and the reporting quality of large long-term weight loss trials. Int J Obes. 2008;32(10):1531–6.

Khan NR, Saad H, Oravec CS, Rossi N, Nguyen V, Venable GT, Lillard JC, Patel P, Taylor DR, Vaughn BN, et al. A review of industry funding in randomized controlled trials published in the neurosurgical literature-the elephant in the room. Neurosurgery. 2018;83(5):890–7.

Hansen C, Lundh A, Rasmussen K, Hrobjartsson A. Financial conflicts of interest in systematic reviews: associations with results, conclusions, and methodological quality. Cochrane Database Syst Rev. 2019;8:Mr000047.

Kiehna EN, Starke RM, Pouratian N, Dumont AS. Standards for reporting randomized controlled trials in neurosurgery. J Neurosurg. 2011;114(2):280–5.

Liu LQ, Morris PJ, Pengel LH. Compliance to the CONSORT statement of randomized controlled trials in solid organ transplantation: a 3-year overview. Transpl Int. 2013;26(3):300–6.

Bala MM, Akl EA, Sun X, Bassler D, Mertz D, Mejza F, Vandvik PO, Malaga G, Johnston BC, Dahm P, et al. Randomized trials published in higher vs. lower impact journals differ in design, conduct, and analysis. J Clin Epidemiol. 2013;66(3):286–95.

Lee SY, Teoh PJ, Camm CF, Agha RA. Compliance of randomized controlled trials in trauma surgery with the CONSORT statement. J Trauma Acute Care Surg. 2013;75(4):562–72.

Ziogas DC, Zintzaras E. Analysis of the quality of reporting of randomized controlled trials in acute and chronic myeloid leukemia, and myelodysplastic syndromes as governed by the CONSORT statement. Ann Epidemiol. 2009;19(7):494–500.

Alvarez F, Meyer N, Gourraud PA, Paul C. CONSORT adoption and quality of reporting of randomized controlled trials: a systematic analysis in two dermatology journals. Br J Dermatol. 2009;161(5):1159–65.

Mbuagbaw L, Thabane M, Vanniyasingam T, Borg Debono V, Kosa S, Zhang S, Ye C, Parpia S, Dennis BB, Thabane L. Improvement in the quality of abstracts in major clinical journals since CONSORT extension for abstracts: a systematic review. Contemporary Clin trials. 2014;38(2):245–50.

Thabane L, Chu R, Cuddy K, Douketis J. What is the quality of reporting in weight loss intervention studies? A systematic review of randomized controlled trials. Int J Obes. 2007;31(10):1554–9.

Murad MH, Wang Z. Guidelines for reporting meta-epidemiological methodology research. Evidence Based Med. 2017;22(4):139.

METRIC - MEthodological sTudy ReportIng Checklist: guidelines for reporting methodological studies in health research [ http://www.equator-network.org/library/reporting-guidelines-under-development/reporting-guidelines-under-development-for-other-study-designs/#METRIC ]. Accessed 31 Aug 2020.

Jager KJ, Zoccali C, MacLeod A, Dekker FW. Confounding: what it is and how to deal with it. Kidney Int. 2008;73(3):256–60.

Parker SG, Halligan S, Erotocritou M, Wood CPJ, Boulton RW, Plumb AAO, Windsor ACJ, Mallett S. A systematic methodological review of non-randomised interventional studies of elective ventral hernia repair: clear definitions and a standardised minimum dataset are needed. Hernia. 2019.

Bouwmeester W, Zuithoff NPA, Mallett S, Geerlings MI, Vergouwe Y, Steyerberg EW, Altman DG, Moons KGM. Reporting and methods in clinical prediction research: a systematic review. PLoS Med. 2012;9(5):1–12.

Schiller P, Burchardi N, Niestroj M, Kieser M. Quality of reporting of clinical non-inferiority and equivalence randomised trials--update and extension. Trials. 2012;13:214.

Riado Minguez D, Kowalski M, Vallve Odena M, Longin Pontzen D, Jelicic Kadic A, Jeric M, Dosenovic S, Jakus D, Vrdoljak M, Poklepovic Pericic T, et al. Methodological and reporting quality of systematic reviews published in the highest ranking journals in the field of pain. Anesth Analg. 2017;125(4):1348–54.

Thabut G, Estellat C, Boutron I, Samama CM, Ravaud P. Methodological issues in trials assessing primary prophylaxis of venous thrombo-embolism. Eur Heart J. 2005;27(2):227–36.

Puljak L, Riva N, Parmelli E, González-Lorenzo M, Moja L, Pieper D. Data extraction methods: an analysis of internal reporting discrepancies in single manuscripts and practical advice. J Clin Epidemiol. 2020;117:158–64.

Ritchie A, Seubert L, Clifford R, Perry D, Bond C. Do randomised controlled trials relevant to pharmacy meet best practice standards for quality conduct and reporting? A systematic review. Int J Pharm Pract. 2019.

Babic A, Vuka I, Saric F, Proloscic I, Slapnicar E, Cavar J, Pericic TP, Pieper D, Puljak L. Overall bias methods and their use in sensitivity analysis of Cochrane reviews were not consistent. J Clin Epidemiol. 2019.

Tan A, Porcher R, Crequit P, Ravaud P, Dechartres A. Differences in treatment effect size between overall survival and progression-free survival in immunotherapy trials: a Meta-epidemiologic study of trials with results posted at ClinicalTrials.gov. J Clin Oncol. 2017;35(15):1686–94.

Croitoru D, Huang Y, Kurdina A, Chan AW, Drucker AM. Quality of reporting in systematic reviews published in dermatology journals. Br J Dermatol. 2020;182(6):1469–76.

Khan MS, Ochani RK, Shaikh A, Vaduganathan M, Khan SU, Fatima K, Yamani N, Mandrola J, Doukky R, Krasuski RA: Assessing the Quality of Reporting of Harms in Randomized Controlled Trials Published in High Impact Cardiovascular Journals. Eur Heart J Qual Care Clin Outcomes 2019.

Rosmarakis ES, Soteriades ES, Vergidis PI, Kasiakou SK, Falagas ME. From conference abstract to full paper: differences between data presented in conferences and journals. FASEB J. 2005;19(7):673–80.

Mueller M, D’Addario M, Egger M, Cevallos M, Dekkers O, Mugglin C, Scott P. Methods to systematically review and meta-analyse observational studies: a systematic scoping review of recommendations. BMC Med Res Methodol. 2018;18(1):44.

Li G, Abbade LPF, Nwosu I, Jin Y, Leenus A, Maaz M, Wang M, Bhatt M, Zielinski L, Sanger N, et al. A scoping review of comparisons between abstracts and full reports in primary biomedical research. BMC Med Res Methodol. 2017;17(1):181.

Krnic Martinic M, Pieper D, Glatt A, Puljak L. Definition of a systematic review used in overviews of systematic reviews, meta-epidemiological studies and textbooks. BMC Med Res Methodol. 2019;19(1):203.

Analytical study [ https://medical-dictionary.thefreedictionary.com/analytical+study ]. Accessed 31 Aug 2020.

Tricco AC, Tetzlaff J, Pham B, Brehaut J, Moher D. Non-Cochrane vs. Cochrane reviews were twice as likely to have positive conclusion statements: cross-sectional study. J Clin Epidemiol. 2009;62(4):380–6 e381.

Schalken N, Rietbergen C. The reporting quality of systematic reviews and Meta-analyses in industrial and organizational psychology: a systematic review. Front Psychol. 2017;8:1395.

Ranker LR, Petersen JM, Fox MP. Awareness of and potential for dependent error in the observational epidemiologic literature: A review. Ann Epidemiol. 2019;36:15–9 e12.

Paquette M, Alotaibi AM, Nieuwlaat R, Santesso N, Mbuagbaw L. A meta-epidemiological study of subgroup analyses in cochrane systematic reviews of atrial fibrillation. Syst Rev. 2019;8(1):241.

Download references


This work did not receive any dedicated funding.

Author information

Authors and affiliations.

Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, Canada

Lawrence Mbuagbaw, Daeria O. Lawson & Lehana Thabane

Biostatistics Unit/FSORC, 50 Charlton Avenue East, St Joseph’s Healthcare—Hamilton, 3rd Floor Martha Wing, Room H321, Hamilton, Ontario, L8N 4A6, Canada

Lawrence Mbuagbaw & Lehana Thabane

Centre for the Development of Best Practices in Health, Yaoundé, Cameroon

Lawrence Mbuagbaw

Center for Evidence-Based Medicine and Health Care, Catholic University of Croatia, Ilica 242, 10000, Zagreb, Croatia

Livia Puljak

Department of Epidemiology and Biostatistics, School of Public Health – Bloomington, Indiana University, Bloomington, IN, 47405, USA

David B. Allison

Departments of Paediatrics and Anaesthesia, McMaster University, Hamilton, ON, Canada

Lehana Thabane

Centre for Evaluation of Medicine, St. Joseph’s Healthcare-Hamilton, Hamilton, ON, Canada

Population Health Research Institute, Hamilton Health Sciences, Hamilton, ON, Canada



LM conceived the idea and drafted the outline and paper. DOL and LT commented on the idea and draft outline. LM, LP and DOL performed literature searches and data extraction. All authors (LM, DOL, LT, LP, DBA) reviewed several draft versions of the manuscript and approved the final manuscript.

Corresponding author

Correspondence to Lawrence Mbuagbaw.

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

DOL, DBA, LM, LP and LT are involved in the development of a reporting guideline for methodological studies.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Mbuagbaw, L., Lawson, D.O., Puljak, L. et al. A tutorial on methodological studies: the what, when, how and why. BMC Med Res Methodol 20, 226 (2020). https://doi.org/10.1186/s12874-020-01107-7


Received: 27 May 2020

Accepted: 27 August 2020

Published: 07 September 2020

DOI: https://doi.org/10.1186/s12874-020-01107-7


  • Methodological study
  • Meta-epidemiology
  • Research methods
  • Research-on-research

BMC Medical Research Methodology

ISSN: 1471-2288


Research Method


Research Methodology – Types, Examples and Writing Guide


Research Methodology


Research Methodology refers to the systematic and scientific approach used to conduct research, investigate problems, and gather data and information for a specific purpose. It involves the techniques and procedures used to identify, collect, analyze, and interpret data to answer research questions or solve research problems. It also encompasses the philosophical and theoretical frameworks that guide the research process.

Structure of Research Methodology

Research methodology formats can vary depending on the specific requirements of the research project, but the following is a basic example of a structure for a research methodology section:

I. Introduction

  • Provide an overview of the research problem and the need for a research methodology section
  • Outline the main research questions and objectives

II. Research Design

  • Explain the research design chosen and why it is appropriate for the research question(s) and objectives
  • Discuss any alternative research designs considered and why they were not chosen
  • Describe the research setting and participants (if applicable)

III. Data Collection Methods

  • Describe the methods used to collect data (e.g., surveys, interviews, observations)
  • Explain how the data collection methods were chosen and why they are appropriate for the research question(s) and objectives
  • Detail any procedures or instruments used for data collection

IV. Data Analysis Methods

  • Describe the methods used to analyze the data (e.g., statistical analysis, content analysis)
  • Explain how the data analysis methods were chosen and why they are appropriate for the research question(s) and objectives
  • Detail any procedures or software used for data analysis

V. Ethical Considerations

  • Discuss any ethical issues that may arise from the research and how they were addressed
  • Explain how informed consent was obtained (if applicable)
  • Detail any measures taken to ensure confidentiality and anonymity

VI. Limitations

  • Identify any potential limitations of the research methodology and how they may impact the results and conclusions

VII. Conclusion

  • Summarize the key aspects of the research methodology section
  • Explain how the research methodology addresses the research question(s) and objectives

Research Methodology Types

Types of Research Methodology are as follows:

Quantitative Research Methodology

This is a research methodology that involves the collection and analysis of numerical data using statistical methods. This type of research is often used to study cause-and-effect relationships and to make predictions.

Qualitative Research Methodology

This is a research methodology that involves the collection and analysis of non-numerical data such as words, images, and observations. This type of research is often used to explore complex phenomena, to gain an in-depth understanding of a particular topic, and to generate hypotheses.

Mixed-Methods Research Methodology

This is a research methodology that combines elements of both quantitative and qualitative research. This approach can be particularly useful for studies that aim to explore complex phenomena and to provide a more comprehensive understanding of a particular topic.

Case Study Research Methodology

This is a research methodology that involves in-depth examination of a single case or a small number of cases. Case studies are often used in psychology, sociology, and anthropology to gain a detailed understanding of a particular individual or group.

Action Research Methodology

This is a research methodology that involves a collaborative process between researchers and practitioners to identify and solve real-world problems. Action research is often used in education, healthcare, and social work.

Experimental Research Methodology

This is a research methodology that involves the manipulation of one or more independent variables to observe their effects on a dependent variable. Experimental research is often used to study cause-and-effect relationships and to make predictions.

Survey Research Methodology

This is a research methodology that involves the collection of data from a sample of individuals using questionnaires or interviews. Survey research is often used to study attitudes, opinions, and behaviors.

Grounded Theory Research Methodology

This is a research methodology that involves the development of theories based on the data collected during the research process. Grounded theory is often used in sociology and anthropology to generate theories about social phenomena.

Research Methodology Example

An Example of Research Methodology could be the following:

Research Methodology for Investigating the Effectiveness of Cognitive Behavioral Therapy in Reducing Symptoms of Depression in Adults


Introduction:

The aim of this research is to investigate the effectiveness of cognitive-behavioral therapy (CBT) in reducing symptoms of depression in adults. To achieve this objective, a randomized controlled trial (RCT) will be conducted using a mixed-methods approach.

Research Design:

The study will follow a pre-test and post-test design with two groups: an experimental group receiving CBT and a control group receiving no intervention. The study will also include a qualitative component, in which semi-structured interviews will be conducted with a subset of participants to explore their experiences of receiving CBT.


Participants:

Participants will be recruited from community mental health clinics in the local area. The sample will consist of 100 adults aged 18 to 65 years who meet the diagnostic criteria for major depressive disorder. Participants will be randomly assigned to either the experimental group or the control group.
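The random assignment described above could be sketched as follows. This is a minimal illustration in Python, not the procedure of any actual study: the participant IDs, the fixed seed, and the function name `randomize` are assumptions made for the example, and it uses simple 1:1 allocation by shuffling rather than any stratified or blocked scheme.

```python
import random

def randomize(participant_ids, seed=2024):
    """Randomly split participants into two equal-sized groups (1:1 allocation)."""
    rng = random.Random(seed)  # hypothetical fixed seed so the allocation can be reproduced
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "experimental": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }

# Hypothetical IDs standing in for the 100 adults in the example
ids = [f"P{n:03d}" for n in range(1, 101)]
groups = randomize(ids)
print(len(groups["experimental"]), len(groups["control"]))  # prints: 50 50
```

Fixing the seed makes the allocation auditable; in practice the allocation list would usually be generated and held by someone not involved in recruitment.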

Intervention:

The experimental group will receive 12 weekly sessions of CBT, each lasting 60 minutes. The intervention will be delivered by licensed mental health professionals who have been trained in CBT. The control group will receive no intervention during the study period.

Data Collection:

Quantitative data will be collected through the use of standardized measures such as the Beck Depression Inventory-II (BDI-II) and the Generalized Anxiety Disorder-7 (GAD-7). Data will be collected at baseline, immediately after the intervention, and at a 3-month follow-up. Qualitative data will be collected through semi-structured interviews with a subset of participants from the experimental group. The interviews will be conducted at the end of the intervention period, and will explore participants’ experiences of receiving CBT.

Data Analysis:

Quantitative data will be analyzed using descriptive statistics, t-tests, and mixed-model analyses of variance (ANOVA) to assess the effectiveness of the intervention. Qualitative data will be analyzed using thematic analysis to identify common themes and patterns in participants’ experiences of receiving CBT.
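As a rough illustration of the quantitative step, the sketch below computes descriptive statistics and a t statistic for the difference between groups. Several things here are assumptions for the example: the scores are invented, the comparison is made on change scores (baseline minus post-intervention), and Welch's unequal-variance t-test is used as one possible choice of t-test. Only the statistic and its approximate degrees of freedom are computed; obtaining a p-value would need a t distribution from a statistics package.

```python
from math import sqrt
from statistics import mean, stdev, variance

def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom for two independent samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Invented BDI-II change scores (baseline minus post-intervention); not real data
cbt_change = [12, 9, 15, 7, 11, 14, 8, 10]
control_change = [3, 5, 1, 4, 6, 2, 0, 3]

for name, scores in [("CBT", cbt_change), ("control", control_change)]:
    print(f"{name}: mean={mean(scores):.2f}, sd={stdev(scores):.2f}")

t, df = welch_t(cbt_change, control_change)
print(f"Welch t={t:.2f}, df={df:.1f}")
```

A positive t here means the CBT group improved more on average than the control group. The mixed-model ANOVA mentioned above would additionally model the repeated measurements over time, which this sketch does not attempt.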

Ethical Considerations:

This study will comply with ethical guidelines for research involving human subjects. Participants will provide informed consent before participating in the study, and their privacy and confidentiality will be protected throughout the study. Any adverse events or reactions will be reported and managed appropriately.

Data Management:

All data collected will be kept confidential and stored securely using password-protected databases. Identifying information will be removed from qualitative data transcripts to ensure participants’ anonymity.
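Removing names from transcripts could look something like the sketch below. It is only an illustration under stated assumptions: the transcript, the name, the code format `PARTICIPANT_01`, and the function name `pseudonymize` are all invented, and simple name replacement does not catch indirect identifiers such as places or employers.

```python
import re

def pseudonymize(transcript, names):
    """Replace each listed name with a study code; return the cleaned text and the linking key."""
    key = {name: f"PARTICIPANT_{i:02d}" for i, name in enumerate(names, start=1)}
    cleaned = transcript
    for name, code in key.items():
        # \b limits the match to whole words; re.escape guards names containing special characters
        cleaned = re.sub(rf"\b{re.escape(name)}\b", code, cleaned)
    return cleaned, key

# Invented transcript fragment, not real interview data
text = "Interviewer: How did the sessions feel, Maria? Maria: They helped me plan my week."
cleaned, key = pseudonymize(text, ["Maria"])
print(cleaned)
```

The linking key would be stored separately from the cleaned transcripts, under the same access controls as other identifying data.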


Limitations:

One potential limitation of this study is that it only focuses on one type of psychotherapy, CBT, and may not generalize to other types of therapy or interventions. Another limitation is that the study will only include participants from community mental health clinics, which may not be representative of the general population.


Conclusion:

This research aims to investigate the effectiveness of CBT in reducing symptoms of depression in adults. By using a randomized controlled trial and a mixed-methods approach, the study will provide evidence on whether CBT reduces depressive symptoms and on how participants experience the therapy. The results of this study will have important implications for the development of effective treatments for depression in clinical settings.

How to Write Research Methodology

Writing a research methodology involves explaining the methods and techniques you used to conduct research, collect data, and analyze results. It’s an essential section of any research paper or thesis, as it helps readers understand the validity and reliability of your findings. Here are the steps to write a research methodology:

  • Start by explaining your research question: Begin the methodology section by restating your research question and explaining why it’s important. This helps readers understand the purpose of your research and the rationale behind your methods.
  • Describe your research design: Explain the overall approach you used to conduct research. This could be a qualitative or quantitative research design, experimental or non-experimental, case study or survey, etc. Discuss the advantages and limitations of the chosen design.
  • Discuss your sample: Describe the participants or subjects you included in your study. Include details such as their demographics, sampling method, sample size, and any exclusion criteria used.
  • Describe your data collection methods: Explain how you collected data from your participants. This could include surveys, interviews, observations, questionnaires, or experiments. Include details on how you obtained informed consent, how you administered the tools, and how you minimized the risk of bias.
  • Explain your data analysis techniques: Describe the methods you used to analyze the data you collected. This could include statistical analysis, content analysis, thematic analysis, or discourse analysis. Explain how you dealt with missing data, outliers, and any other issues that arose during the analysis.
  • Discuss the validity and reliability of your research: Explain how you ensured the validity and reliability of your study. This could include measures such as triangulation, member checking, peer review, or inter-coder reliability.
  • Acknowledge any limitations of your research: Discuss any limitations of your study, including any potential threats to validity or generalizability. This helps readers understand the scope of your findings and how they might apply to other contexts.
  • Provide a summary: End the methodology section by summarizing the methods and techniques you used to conduct your research. This provides a clear overview of your research methodology and helps readers understand the process you followed to arrive at your findings.
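Of the reliability measures mentioned in the steps above, inter-coder reliability is the easiest to quantify. One common statistic for two coders is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is an illustration only: the theme labels and both coders' decisions are invented, and the function name `cohen_kappa` is an assumption for the example.

```python
from collections import Counter

def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each labelled the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Proportion of items on which the two coders chose the same label
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)

# Invented theme codes assigned by two coders to the same eight interview excerpts
coder_1 = ["coping", "coping", "stigma", "support", "stigma", "coping", "support", "stigma"]
coder_2 = ["coping", "stigma", "stigma", "support", "stigma", "coping", "support", "coping"]
print(round(cohen_kappa(coder_1, coder_2), 2))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance; the threshold treated as acceptable varies by field and should be stated in the methodology.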

When to Write Research Methodology

Research methodology is typically written after the research proposal has been approved and before data collection and analysis begin, because it provides a clear roadmap for the research project.

The research methodology is an important section of any research paper or thesis, as it describes the methods and procedures that will be used to conduct the research. It should include details about the research design, data collection methods, data analysis techniques, and any ethical considerations.

The methodology should be written in a clear and concise manner, and it should be based on established research practices and standards. It is important to provide enough detail so that the reader can understand how the research was conducted and evaluate the validity of the results.

Applications of Research Methodology

Here are some of the applications of research methodology:

  • To identify the research problem: Research methodology is used to identify the research problem, which is the first step in conducting any research.
  • To design the research: Research methodology helps in designing the research by selecting the appropriate research method, research design, and sampling technique.
  • To collect data: Research methodology provides a systematic approach to collect data from primary and secondary sources.
  • To analyze data: Research methodology helps in analyzing the collected data using various statistical and non-statistical techniques.
  • To test hypotheses: Research methodology provides a framework for testing hypotheses and drawing conclusions based on the analysis of data.
  • To generalize findings: Research methodology helps in generalizing the findings of the research to the target population.
  • To develop theories: Research methodology is used to develop new theories and modify existing theories based on the findings of the research.
  • To evaluate programs and policies: Research methodology is used to evaluate the effectiveness of programs and policies by collecting data and analyzing it.
  • To improve decision-making: Research methodology helps in making informed decisions by providing reliable and valid data.

Purpose of Research Methodology

Research methodology serves several important purposes, including:

  • To guide the research process: Research methodology provides a systematic framework for conducting research. It helps researchers to plan their research, define their research questions, and select appropriate methods and techniques for collecting and analyzing data.
  • To ensure research quality: Research methodology helps researchers to ensure that their research is rigorous, reliable, and valid. It provides guidelines for minimizing bias and error in data collection and analysis, and for ensuring that research findings are accurate and trustworthy.
  • To replicate research: Research methodology provides a clear and detailed account of the research process, making it possible for other researchers to replicate the study and verify its findings.
  • To advance knowledge: Research methodology enables researchers to generate new knowledge and to contribute to the body of knowledge in their field. It provides a means for testing hypotheses, exploring new ideas, and discovering new insights.
  • To inform decision-making: Research methodology provides evidence-based information that can inform policy and decision-making in a variety of fields, including medicine, public health, education, and business.

Advantages of Research Methodology

Research methodology has several advantages that make it a valuable tool for conducting research in various fields. Here are some of the key advantages of research methodology:

  • Systematic and structured approach: Research methodology provides a systematic and structured approach to conducting research, which ensures that the research is conducted in a rigorous and comprehensive manner.
  • Objectivity: Research methodology aims to ensure objectivity in the research process, which means that the research findings are based on evidence and not influenced by personal bias or subjective opinions.
  • Replicability: Research methodology ensures that research can be replicated by other researchers, which is essential for validating research findings and ensuring their accuracy.
  • Reliability: Research methodology aims to ensure that the research findings are reliable, which means that they are consistent and can be depended upon.
  • Validity: Research methodology ensures that the research findings are valid, which means that they accurately reflect the research question or hypothesis being tested.
  • Efficiency: Research methodology provides a structured and efficient way of conducting research, which helps to save time and resources.
  • Flexibility: Research methodology allows researchers to choose the most appropriate research methods and techniques based on the research question, data availability, and other relevant factors.
  • Scope for innovation: Research methodology provides scope for innovation and creativity in designing research studies and developing new research techniques.

Research Methodology Vs Research Methods

About the author.

Muhammad Hassan

Researcher, Academic Writer, Web developer

Grad Coach

What Is Research Methodology? A Plain-Language Explanation & Definition (With Examples)

By Derek Jansen (MBA)  and Kerryn Warren (PhD) | June 2020 (Last updated April 2023)

If you’re new to formal academic research, it’s quite likely that you’re feeling a little overwhelmed by all the technical lingo that gets thrown around. And who could blame you – “research methodology”, “research methods”, “sampling strategies”… it all seems never-ending!

In this post, we’ll demystify the landscape with plain-language explanations and loads of examples (including easy-to-follow videos), so that you can approach your dissertation, thesis or research project with confidence. Let’s get started.

Research Methodology 101

  • What exactly research methodology means
  • What qualitative, quantitative and mixed methods are
  • What sampling strategy is
  • What data collection methods are
  • What data analysis methods are
  • How to choose your research methodology
  • Example of a research methodology

What is research methodology?

Research methodology simply refers to the practical “how” of a research study. More specifically, it’s about how a researcher systematically designs a study to ensure valid and reliable results that address the research aims, objectives and research questions. This covers how the researcher went about deciding:

  • What type of data to collect (e.g., qualitative or quantitative data)
  • Who to collect it from (i.e., the sampling strategy)
  • How to collect it (i.e., the data collection method)
  • How to analyse it (i.e., the data analysis methods)

Within any formal piece of academic research (be it a dissertation, thesis or journal article), you’ll find a research methodology chapter or section which covers the aspects mentioned above. Importantly, a good methodology chapter explains not just what methodological choices were made, but also why they were made. In other words, the methodology chapter should justify the design choices by showing that the chosen methods and techniques are the best fit for the research aims, objectives and research questions.

So, it’s the same as research design?

Not quite. As we mentioned, research methodology refers to the collection of practical decisions regarding what data you’ll collect, from whom, how you’ll collect it and how you’ll analyse it. Research design, on the other hand, is more about the overall strategy you’ll adopt in your study, for example, whether you’ll use an experimental design in which you manipulate one variable while controlling others.

What are qualitative, quantitative and mixed-methods?

Qualitative, quantitative and mixed-methods are different types of methodological approaches, distinguished by their focus on words, numbers or both. This is a bit of an oversimplification, but it’s a good starting point for understanding.

Let’s take a closer look.

Qualitative research refers to research which focuses on collecting and analysing words (written or spoken) and textual or visual data, whereas quantitative research focuses on measurement and testing using numerical data. Qualitative analysis can also focus on other “softer” data points, such as body language or visual elements.

It’s quite common for a qualitative methodology to be used when the research aims and research questions are exploratory in nature. For example, a qualitative methodology might be used to understand people’s perceptions about an event that took place, or a political candidate running for president.

In contrast, a quantitative methodology is typically used when the research aims and research questions are confirmatory in nature. For example, a quantitative methodology might be used to measure the relationship between two variables (e.g. personality type and likelihood to commit a crime) or to test a set of hypotheses.

As you’ve probably guessed, the mixed-method methodology attempts to combine the best of both qualitative and quantitative methodologies to integrate perspectives and create a rich picture.

What is sampling strategy?

Simply put, sampling is about deciding who (or where) you’re going to collect your data from. Why does this matter? Well, generally it’s not possible to collect data from every single person in your group of interest (this is called the “population”), so you’ll need to engage a smaller portion of that group that’s accessible and manageable (this is called the “sample”).

How you go about selecting the sample (i.e., your sampling strategy) will have a major impact on your study. There are many different sampling methods you can choose from, but the two overarching categories are probability sampling and non-probability sampling.

Probability sampling involves using a completely random sample from the group of people you’re interested in. This is comparable to throwing the names of all potential participants into a hat, shaking it up, and picking out the “winners”. By using a completely random sample, you’ll minimise the risk of selection bias and the results of your study will be more generalisable to the entire population.

Non-probability sampling, on the other hand, doesn’t use a random sample. For example, it might involve using a convenience sample, which means you’d only interview or survey people that you have access to (perhaps your friends, family or work colleagues), rather than a truly random sample. With non-probability sampling, the results are typically not generalisable.
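
To make the contrast concrete, here is a minimal sketch of the two approaches. It assumes a hypothetical population of 1,000 participant IDs and uses simple random sampling, which is only one form of probability sampling. The variable names and the sample size of 50 are illustrative choices.

```python
import random

# Hypothetical population of 1,000 participant IDs.
population = [f"P{i:04d}" for i in range(1000)]

# Simple random sampling: every member has the same chance of selection
# (the "names in a hat" approach).
rng = random.Random(42)  # fixed seed so the draw is reproducible
probability_sample = rng.sample(population, k=50)

# Convenience sampling: take whoever is easiest to reach
# (here, simply the first 50 IDs), which is not random.
convenience_sample = population[:50]

print(probability_sample[:5])
print(convenience_sample[:5])
```

The convenience sample always contains the same easily reached people, so any characteristic they share can skew the results. The random draw spreads selection across the whole population.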

What are data collection methods?

As the name suggests, data collection methods simply refer to the way in which you go about collecting the data for your study. Some of the most common data collection methods include:

  • Interviews (which can be unstructured, semi-structured or structured)
  • Focus groups and group interviews
  • Surveys (online or physical surveys)
  • Observations (watching and recording activities)
  • Biophysical measurements (e.g., blood pressure, heart rate, etc.)
  • Documents and records (e.g., financial reports, court records, etc.)

The choice of which data collection method to use depends on your overall research aims and research questions, as well as practicalities and resource constraints. For example, if your research is exploratory in nature, qualitative methods such as interviews and focus groups would likely be a good fit. Conversely, if your research aims to measure specific variables or test hypotheses, large-scale surveys that produce large volumes of numerical data would likely be a better fit.

What are data analysis methods?

Data analysis methods refer to the methods and techniques that you’ll use to make sense of your data. These can be grouped according to whether the research is qualitative (words-based) or quantitative (numbers-based).

Popular data analysis methods in qualitative research include:

  • Qualitative content analysis
  • Thematic analysis
  • Discourse analysis
  • Narrative analysis
  • Interpretative phenomenological analysis (IPA)
  • Visual analysis (of photographs, videos, art, etc.)

Qualitative data analysis begins with data coding, after which an analysis method is applied. In some cases, more than one analysis method is used, depending on the research aims and research questions.
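
As a rough illustration of what coded data can look like once coding is complete, the sketch below tallies how often each code was applied and how many participants raised it. The participants, codes and variable names are hypothetical, and real qualitative analysis involves far more interpretation than counting.

```python
from collections import Counter

# Hypothetical coded segments: (participant, code assigned by the researcher).
coded_segments = [
    ("P1", "cost"), ("P1", "trust"), ("P2", "cost"),
    ("P2", "access"), ("P3", "cost"), ("P3", "trust"),
]

# Count how often each code was applied across all segments.
code_counts = Counter(code for _, code in coded_segments)

# Count how many distinct participants mentioned each code
# (the set removes repeated participant/code pairs first).
participants_per_code = Counter(code for _, code in set(coded_segments))

print(code_counts.most_common())
```

A tally like this can help a researcher see which codes recur often enough to be considered as candidate themes.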

Moving on to the quantitative side of things, popular data analysis methods in this type of research include:

  • Descriptive statistics (e.g. means, medians, modes)
  • Inferential statistics (e.g. correlation, regression, structural equation modelling)
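
The two families above can be sketched with a small, made-up dataset. The example assumes eight students with hypothetical weekly study hours and exam scores. It computes the mean, median and mode of one variable, then Pearson's correlation coefficient between the two. The helper `pearson_r` is written out here for illustration and is not a specific library's API.

```python
import math
import statistics

# Hypothetical survey data: weekly study hours and exam scores for 8 students.
hours = [2, 4, 4, 5, 6, 7, 8, 10]
scores = [52, 58, 60, 63, 66, 70, 75, 84]

# Descriptive statistics summarise one variable at a time.
mean_hours = statistics.mean(hours)
median_hours = statistics.median(hours)
mode_hours = statistics.mode(hours)


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Correlation describes the strength of the linear relationship
# between the two variables.
r = pearson_r(hours, scores)
print(mean_hours, median_hours, mode_hours, round(r, 3))
```

A coefficient like this only describes the sample. An inferential test would go on to ask whether the relationship is likely to hold in the wider population.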

Again, the choice of which data analysis method to use depends on your overall research aims and objectives, as well as practicalities and resource constraints.

How do I choose a research methodology?

As you’ve probably picked up by now, your research aims and objectives have a major influence on the research methodology. So, the starting point for developing your research methodology is to take a step back and look at the big picture of your research, before you make methodology decisions. The first question you need to ask yourself is whether your research is exploratory or confirmatory in nature.

If your research aims and objectives are primarily exploratory in nature, your research will likely be qualitative and therefore you might consider qualitative data collection methods (e.g. interviews) and analysis methods (e.g. qualitative content analysis).

Conversely, if your research aims and objectives are looking to measure or test something (i.e. they’re confirmatory), then your research will quite likely be quantitative in nature, and you might consider quantitative data collection methods (e.g. surveys) and analyses (e.g. statistical analysis).

Designing your research and working out your methodology is a large topic, which we cover extensively on the blog. For now, however, the key takeaway is that you should always start with your research aims, objectives and research questions (the golden thread). Every methodological choice you make needs to align with those three components.

Example of a research methodology chapter

In the video below, we provide a detailed walkthrough of a research methodology from an actual dissertation, as well as an overview of our free methodology template.


Leo Balanlay

Thank you for this simple yet comprehensive and easy to digest presentation. God Bless!

Derek Jansen

You’re most welcome, Leo. Best of luck with your research!


I found it very useful. many thanks

Solomon F. Joel

This is really directional. A make-easy research knowledge.

Upendo Mmbaga

Thank you for this, I think will help my research proposal


Thanks for good interpretation,well understood.

Alhaji Alie Kanu

Good morning sorry I want to the search topic

Baraka Gombela

Thank u more


Thank you, your explanation is simple and very helpful.

Suleiman Abubakar

Very educative a.nd exciting platform. A bigger thank you and I’ll like to always be with you

Daniel Mondela

That’s the best analysis


So simple yet so insightful. Thank you.

Wendy Lushaba

This really easy to read as it is self-explanatory. Very much appreciated…


Thanks for this. It’s so helpful and explicit. For those elements highlighted in orange, they were good sources of referrals for concepts I didn’t understand. A million thanks for this.

Tabe Solomon Matebesi

Good morning, I have been reading your research lessons through out a period of times. They are important, impressive and clear. Want to subscribe and be and be active with you.

Hafiz Tahir

Thankyou So much Sir Derek…

Good morning thanks so much for the on line lectures am a student of university of Makeni.select a research topic and deliberate on it so that we’ll continue to understand more.sorry that’s a suggestion.

James Olukoya

Beautiful presentation. I love it.


please provide a research mehodology example for zoology

Ogar , Praise

It’s very educative and well explained

Joseph Chan

Thanks for the concise and informative data.

Goja Terhemba John

This is really good for students to be safe and well understand that research is all about

Prakash thapa

Thank you so much Derek sir🖤🙏🤗


Very simple and reliable

Chizor Adisa

This is really helpful. Thanks alot. God bless you.


very useful, Thank you very much..

nakato justine

thanks a lot its really useful


in a nutshell..thank you!


Thanks for updating my understanding on this aspect of my Thesis writing.


thank you so much my through this video am competently going to do a good job my thesis


Thanks a lot. Very simple to understand. I appreciate 🙏


Very simple but yet insightful Thank you

Adegboyega ADaeBAYO

This has been an eye opening experience. Thank you grad coach team.


Very useful message for research scholars


Really very helpful thank you


yes you are right and i’m left


Research methodology with a simplest way i have never seen before this article.

wogayehu tuji

wow thank u so much

Good morning thanks so much for the on line lectures am a student of university of Makeni.select a research topic and deliberate on is so that we will continue to understand more.sorry that’s a suggestion.


Very precise and informative.

Javangwe Nyeketa

Thanks for simplifying these terms for us, really appreciate it.

Mary Benard Mwanganya

Thanks this has really helped me. It is very easy to understand.


I found the notes and the presentation assisting and opening my understanding on research methodology

Godfrey Martin Assenga

Good presentation

Nhubu Tawanda

Im so glad you clarified my misconceptions. Im now ready to fry my onions. Thank you so much. God bless


Thank you a lot.


thanks for the easy way of learning and desirable presentation.

Ajala Tajudeen

Thanks a lot. I am inspired

Visor Likali

Well written

Pondris Patrick

I am writing a APA Format paper . I using questionnaire with 120 STDs teacher for my participant. Can you write me mthology for this research. Send it through email sent. Just need a sample as an example please. My topic is ” impacts of overcrowding on students learning

Thanks for your comment.

We can’t write your methodology for you. If you’re looking for samples, you should be able to find some sample methodologies on Google. Alternatively, you can download some previous dissertations from a dissertation directory and have a look at the methodology chapters therein.

All the best with your research.


Thank you so much for this!! God Bless


Thank you. Explicit explanation


Thank you, Derek and Kerryn, for making this simple to understand. I’m currently at the inception stage of my research.


Thnks a lot , this was very usefull on my assignment

Beulah Emmanuel

excellent explanation

Gino Raz

I’m currently working on my master’s thesis, thanks for this! I’m certain that I will use Qualitative methodology.


Thanks a lot for this concise piece, it was quite relieving and helpful. God bless you BIG…

Yonas Tesheme

I am currently doing my dissertation proposal and I am sure that I will do quantitative research. Thank you very much it was extremely helpful.

zahid t ahmad

Very interesting and informative yet I would like to know about examples of Research Questions as well, if possible.

Maisnam loyalakla

I’m about to submit a research presentation, I have come to understand from your simplification on understanding research methodology. My research will be mixed methodology, qualitative as well as quantitative. So aim and objective of mixed method would be both exploratory and confirmatory. Thanks you very much for your guidance.

Mila Milano

OMG thanks for that, you’re a life saver. You covered all the points I needed. Thank you so much ❤️ ❤️ ❤️


Thank you immensely for this simple, easy to comprehend explanation of data collection methods. I have been stuck here for months 😩. Glad I found your piece. Super insightful.


I’m going to write synopsis which will be quantitative research method and I don’t know how to frame my topic, can I kindly get some ideas..


Thanks for this, I was really struggling.

This was really informative I was struggling but this helped me.

Modie Maria Neswiswi

Thanks a lot for this information, simple and straightforward. I’m a last year student from the University of South Africa UNISA South Africa.

Mursel Amin

its very much informative and understandable. I have enlightened.

Mustapha Abubakar

An interesting nice exploration of a topic.


Thank you. Accurate and simple🥰

Sikandar Ali Shah

This article was really helpful, it helped me understanding the basic concepts of the topic Research Methodology. The examples were very clear, and easy to understand. I would like to visit this website again. Thank you so much for such a great explanation of the subject.


Thanks dude


Thank you Doctor Derek for this wonderful piece, please help to provide your details for reference purpose. God bless.


Many compliments to you


Great work , thank you very much for the simple explanation


Thank you. I had to give a presentation on this topic. I have looked everywhere on the internet but this is the best and simple explanation.

omodara beatrice

thank you, its very informative.


Well explained. Now I know my research methodology will be qualitative and exploratory. Thank you so much, keep up the good work


Well explained, thank you very much.

Ainembabazi Rose

This is good explanation, I have understood the different methods of research. Thanks a lot.

Kamran Saeed

Great work…very well explanation

Hyacinth Chebe Ukwuani

Thanks Derek. Kerryn was just fantastic!

Great to hear that, Hyacinth. Best of luck with your research!

Matobela Joel Marabi

Its a good templates very attractive and important to PhD students and lectuter

Thanks for the feedback, Matobela. Good luck with your research methodology.


Thank you. This is really helpful.

You’re very welcome, Elie. Good luck with your research methodology.

Sakina Dalal

Well explained thanks


This is a very helpful site especially for young researchers at college. It provides sufficient information to guide students and equip them with the necessary foundation to ask any other questions aimed at deepening their understanding.

Thanks for the kind words, Edward. Good luck with your research!

Ngwisa Marie-claire NJOTU

Thank you. I have learned a lot.

Great to hear that, Ngwisa. Good luck with your research methodology!


Thank you for keeping your presentation simples and short and covering key information for research methodology. My key takeaway: Start with defining your research objective the other will depend on the aims of your research question.


My name is Zanele I would like to be assisted with my research , and the topic is shortage of nursing staff globally want are the causes , effects on health, patients and community and also globally

Oluwafemi Taiwo

Thanks for making it simple and clear. It greatly helped in understanding research methodology. Regards.


This is well simplified and straight to the point

Gabriel mugangavari

Thank you Dr

Dina Haj Ibrahim

I was given an assignment to research 2 publications and describe their research methodology? I don’t know how to start this task can someone help me?

Sure. You’re welcome to book an initial consultation with one of our Research Coaches to discuss how we can assist – https://gradcoach.com/book/new/ .


Thanks a lot I am relieved of a heavy burden.keep up with the good work

Ngaka Mokoena

I’m very much grateful Dr Derek. I’m planning to pursue one of the careers that really needs one to be very much eager to know. There’s a lot of research to do and everything, but since I’ve gotten this information I will use it to the best of my potential.

Pritam Pal

Thank you so much, words are not enough to explain how helpful this session has been for me!


Thanks this has thought me alot.

kenechukwu ambrose

Very concise and helpful. Thanks a lot

Eunice Shatila Sinyemu 32070

Thank Derek. This is very helpful. Your step by step explanation has made it easier for me to understand different concepts. Now i can get on with my research.


I wish i had come across this sooner. So simple but yet insightful

yugine the

really nice explanation thank you so much


I’m so grateful finding this site, it’s really helpful…….every term well explained and provide accurate understanding especially to student going into an in-depth research for the very first time, even though my lecturer already explained this topic to the class, I think I got the clear and efficient explanation here, much thanks to the author.


It is very helpful material

Lubabalo Ntshebe

I would like to be assisted with my research topic : Literature Review and research methodologies. My topic is : what is the relationship between unemployment and economic growth?


Its really nice and good for us.

Ekokobe Aloysius



Short but sweet.Thank you

Shishir Pokharel

Informative article. Thanks for your detailed information.

Badr Alharbi

I’m currently working on my Ph.D. thesis. Thanks a lot, Derek and Kerryn, Well-organized sequences, facilitate the readers’ following.


great article for someone who does not have any background can even understand

Hasan Chowdhury

I am a bit confused about research design and methodology. Are they the same? If not, what are the differences and how are they related?

Thanks in advance.

Ndileka Myoli

concise and informative.

Sureka Batagoda

Thank you very much

More Smith

How can we site this article is Harvard style?


Very well written piece that afforded better understanding of the concept. Thank you!

Denis Eken Lomoro

Am a new researcher trying to learn how best to write a research proposal. I find your article spot on and want to download the free template but finding difficulties. Can u kindly send it to my email, the free download entitled, “Free Download: Research Proposal Template (with Examples)”.

fatima sani

Thank too much


Thank you very much for your comprehensive explanation about research methodology so I like to thank you again for giving us such great things.

Aqsa Iftijhar

Good very well explained.Thanks for sharing it.

Krishna Dhakal

Thank u sir, it is really a good guideline.


so helpful thank you very much.

Joelma M Monteiro

Thanks for the video it was very explanatory and detailed, easy to comprehend and follow up. please, keep it up the good work


It was very helpful, a well-written document with precise information.

orebotswe morokane

how do i reference this?


MLA Jansen, Derek, and Kerryn Warren. “What (Exactly) Is Research Methodology?” Grad Coach, June 2021, gradcoach.com/what-is-research-methodology/.

APA Jansen, D., & Warren, K. (2021, June). What (Exactly) Is Research Methodology? Grad Coach. https://gradcoach.com/what-is-research-methodology/


Your explanation is easily understood. Thank you

Dr Christie

Very help article. Now I can go my methodology chapter in my thesis with ease

Alice W. Mbuthia

I feel guided ,Thank you

Joseph B. Smith

This simplification is very helpful. It is simple but very educative, thanks ever so much

Dr. Ukpai Ukpai Eni

The write up is informative and educative. It is an academic intellectual representation that every good researcher can find useful. Thanks

chimbini Joseph

Wow, this is wonderful long live.


Nice initiative


thank you the video was helpful to me.


Thank you very much for your simple and clear explanations I’m really satisfied by the way you did it By now, I think I can realize a very good article by following your fastidious indications May God bless you


Thanks very much, it was very concise and informational for a beginner like me to gain an insight into what i am about to undertake. I really appreciate.

Adv Asad Ali

very informative sir, it is amazing to understand the meaning of question hidden behind that, and simple language is used other than legislature to understand easily. stay happy.

Jonas Tan

This one is really amazing. All content in your youtube channel is a very helpful guide for doing research. Thanks, GradCoach.

mahmoud ali

research methodologies

Lucas Sinyangwe

Please send me more information concerning dissertation research.

Amamten Jr.

Nice piece of knowledge shared….. #Thump_UP

Hajara Salihu

This is amazing, it has said it all. Thanks to Gradcoach

Gerald Andrew Babu

This is wonderful,very elaborate and clear.I hope to reach out for your assistance in my research very soon.


This is the answer I am searching about…

realy thanks a lot

Ahmed Saeed

Thank you very much for this awesome, to the point and inclusive article.

Soraya Kolli

Thank you very much I need validity and reliability explanation I have exams


Thank you for a well explained piece. This will help me going forward.

Emmanuel Chukwuma

Very simple and well detailed Many thanks

Zeeshan Ali Khan

This is so very simple yet so very effective and comprehensive. An Excellent piece of work.

Molly Wasonga

I wish I saw this earlier on! Great insights for a beginner(researcher) like me. Thanks a mil!

Blessings Chigodo

Thank you very much, for such a simplified, clear and practical step by step both for academic students and general research work. Holistic, effective to use and easy to read step by step. One can easily apply the steps in practical terms and produce a quality document/up-to standard

Thanks for simplifying these terms for us, really appreciated.

Joseph Kyereme

Thanks for a great work. well understood .


This was very helpful. It was simple but profound and very easy to understand. Thank you so much!


Great and amazing research guidelines. Best site for learning research

ankita bhatt

hello sir/ma’am, i didn’t find yet that what type of research methodology i am using. because i am writing my report on CSR and collect all my data from websites and articles so which type of methodology i should write in dissertation report. please help me. i am from India.


how does this really work?

princelow presley

perfect content, thanks a lot

George Nangpaak Duut

As a researcher, I commend you for the detailed and simplified information on the topic in question. I would like to remain in touch for the sharing of research ideas on other topics. Thank you


Impressive. Thank you, Grad Coach 😍

Thank you Grad Coach for this piece of information. I have at least learned about the different types of research methodologies.

Varinder singh Rana

Very useful content with easy way

Mbangu Jones Kashweeka

Thank you very much for the presentation. I am an MPH student with the Adventist University of Africa. I have successfully completed my theory and starting on my research this July. My topic is “Factors associated with Dental Caries in (one District) in Botswana. I need help on how to go about this quantitative research

Carolyn Russell

I am so grateful to run across something that was sooo helpful. I have been on my doctorate journey for quite some time. Your breakdown on methodology helped me to refresh my intent. Thank you.

Indabawa Musbahu

thanks so much for this good lecture. student from university of science and technology, Wudil. Kano Nigeria.

Limpho Mphutlane

It’s profound easy to understand I appreciate

Mustafa Salimi

Thanks a lot for sharing superb information in a detailed but concise manner. It was really helpful and helped a lot in getting into my own research methodology.

Rabilu yau

thanks very much

Ari M. Hussein

This was sooo helpful for me thank you so much i didn’t even know what i had to write thank you!

You’re most welcome 🙂

Varsha Patnaik

Simple and good. Very much helpful. Thank you so much.


This is very good work. I have benefited.

Dr Md Asraul Hoque

Thank you so much for sharing

Nkasa lizwi

This is powerful thank you so much guys

I am nkasa lizwi doing my research proposal on honors with the university of Walter Sisulu Komani I m on part 3 now can you assist me.my topic is: transitional challenges faced by educators in intermediate phase in the Alfred Nzo District.

Atonisah Jonathan

Appreciate the presentation. Very useful step-by-step guidelines to follow.

Bello Suleiman

I appreciate sir


wow! This is super insightful for me. Thank you!

Emerita Guzman

Indeed this material is very helpful! Kudos writers/authors.


I want to say thank you very much, I got a lot of info and knowledge. Be blessed.

Akanji wasiu

I want present a seminar paper on Optimisation of Deep learning-based models on vulnerability detection in digital transactions.

Need assistance

Clement Lokwar

Dear Sir, I want to be assisted on my research on Sanitation and Water management in emergencies areas.

Peter Sone Kome

I am deeply grateful for the knowledge gained. I will be getting in touch shortly as I want to be assisted in my ongoing research.


The information shared is informative, crisp and clear. Kudos Team! And thanks a lot!

Bipin pokhrel

hello i want to study


Hello!! Grad coach teams. I am extremely happy in your tutorial or consultation. i am really benefited all material and briefing. Thank you very much for your generous helps. Please keep it up. If you add in your briefing, references for further reading, it will be very nice.


All I have to say is, thank u gyz.


Good, l thanks

Artak Ghonyan

thank you, it is very useful




Your Modern Business Guide To Data Analysis Methods And Techniques

Data analysis methods and techniques blog post by datapine

Table of Contents

1) What Is Data Analysis?

2) Why Is Data Analysis Important?

3) What Is The Data Analysis Process?

4) Types Of Data Analysis Methods

5) Top Data Analysis Techniques To Apply

6) Quality Criteria For Data Analysis

7) Data Analysis Limitations & Barriers

8) Data Analysis Skills

9) Data Analysis In The Big Data Environment

In our data-rich age, understanding how to analyze and extract true meaning from our business’s digital insights is one of the primary drivers of success.

Despite the colossal volume of data we create every day, a mere 0.5% is actually analyzed and used for data discovery, improvement, and intelligence. While that may not seem like much, considering the amount of digital information we have at our fingertips, half a percent still accounts for a vast amount of data.

With so much data and so little time, knowing how to collect, curate, organize, and make sense of all of this potentially business-boosting information can be a minefield – but online data analysis is the solution.

In science, data analysis uses a more complex approach with advanced techniques to explore and experiment with data. On the other hand, in a business context, data is used to make data-driven decisions that will enable the company to improve its overall performance. In this post, we will cover the analysis of data from an organizational point of view while still going through the scientific and statistical foundations that are fundamental to understanding the basics of data analysis. 

To put all of that into perspective, we will answer a host of important analytical questions, explore analytical methods and techniques, while demonstrating how to perform analysis in the real world with a 17-step blueprint for success.

What Is Data Analysis?

Data analysis is the process of collecting, modeling, and analyzing data using various statistical and logical methods and techniques. Businesses rely on analytics processes and tools to extract insights that support strategic and operational decision-making.

All these various methods are largely based on two core areas: quantitative and qualitative research.

To explain the key differences between qualitative and quantitative research, here’s a video for your viewing pleasure:

Gaining a better understanding of different techniques and methods in quantitative research as well as qualitative insights will give your analyzing efforts a more clearly defined direction, so it’s worth taking the time to allow this particular knowledge to sink in. Additionally, you will be able to create a comprehensive analytical report that will skyrocket your analysis.

Apart from qualitative and quantitative categories, there are also other types of data that you should be aware of before diving into complex data analysis processes. These categories include: 

  • Big data: Refers to massive data sets that need to be analyzed using advanced software to reveal patterns and trends. It is considered to be one of the best analytical assets as it provides larger volumes of data at a faster rate. 
  • Metadata: Putting it simply, metadata is data that provides insights about other data. It summarizes key information about specific data that makes it easier to find and reuse for later purposes. 
  • Real time data: As its name suggests, real time data is presented as soon as it is acquired. From an organizational perspective, this is the most valuable data as it can help you make important decisions based on the latest developments. Our guide on real time analytics will tell you more about the topic. 
  • Machine data: This is more complex data that is generated solely by a machine such as phones, computers, or even websites and embedded systems, without previous human interaction.

Why Is Data Analysis Important?

Before we go into detail about the categories of analysis along with its methods and techniques, you must understand the potential that analyzing data can bring to your organization.

  • Informed decision-making: From a management perspective, you can benefit from analyzing your data as it helps you make decisions based on facts and not simple intuition. For instance, you can understand where to invest your capital, detect growth opportunities, predict your income, or tackle uncommon situations before they become problems. Through this, you can extract relevant insights from all areas in your organization, and with the help of dashboard software, present the data in a professional and interactive way to different stakeholders.
  • Reduce costs: Another great benefit is to reduce costs. With the help of advanced technologies such as predictive analytics, businesses can spot improvement opportunities, trends, and patterns in their data and plan their strategies accordingly. In time, this will help you save money and resources that would otherwise be spent on implementing the wrong strategies. And not just that, by predicting different scenarios such as sales and demand, you can also anticipate production and supply. 
  • Target customers better: Customers are arguably the most crucial element in any business. By using analytics to get a 360° vision of all aspects related to your customers, you can understand which channels they use to communicate with you, their demographics, interests, habits, purchasing behaviors, and more. In the long run, it will make your marketing strategies more successful, allow you to identify new potential customers, and avoid wasting resources on targeting the wrong people or sending the wrong message. You can also track customer satisfaction by analyzing your clients’ reviews or your customer service department’s performance.

What Is The Data Analysis Process?

Data analysis process graphic

When we talk about analyzing data there is an order to follow in order to extract the needed conclusions. The analysis process consists of 5 key stages. We will cover each of them more in detail later in the post, but to start providing the needed context to understand what is coming next, here is a rundown of the 5 essential steps of data analysis. 

  • Identify: Before you get your hands dirty with data, you first need to identify why you need it in the first place. The identification is the stage in which you establish the questions you will need to answer. For example, what is the customer's perception of our brand? Or what type of packaging is more engaging to our potential customers? Once the questions are outlined you are ready for the next step. 
  • Collect: As its name suggests, this is the stage where you start collecting the needed data. Here, you define which sources of data you will use and how you will use them. The collection of data can come in different forms such as internal or external sources, surveys, interviews, questionnaires, and focus groups, among others.  An important note here is that the way you collect the data will be different in a quantitative and qualitative scenario. 
  • Clean: Once you have the necessary data, it is time to clean it and leave it ready for analysis. Not all the data you collect will be useful. When collecting large amounts of data in different formats, it is very likely that you will find yourself with duplicate or badly formatted data. To avoid this, before you start working with your data, make sure to remove any stray white spaces, duplicate records, and formatting errors. This way you avoid hurting your analysis with bad-quality data. 
  • Analyze: With the help of various techniques such as statistical analysis, regressions, neural networks, text analysis, and more, you can start analyzing and manipulating your data to extract relevant conclusions. At this stage, you find trends, correlations, variations, and patterns that can help you answer the questions you first thought of in the identify stage. Various technologies in the market assist researchers and average users with the management of their data. Some of them include business intelligence and visualization software, predictive analytics, and data mining, among others. 
  • Interpret: Last but not least you have one of the most important steps: it is time to interpret your results. This stage is where the researcher comes up with courses of action based on the findings. For example, here you would understand if your clients prefer packaging that is red or green, plastic or paper, etc. Additionally, at this stage, you can also find some limitations and work on them. 
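To make the clean stage more concrete, here is a minimal sketch in plain Python. The records, the field names (`name`, `email`), and the helper `clean_records` are made up for illustration, and lowercasing emails is just one example of fixing a formatting inconsistency. Real projects usually rely on a dedicated data preparation tool or library.

```python
def clean_records(rows):
    """Trim white space, normalize email case, and drop duplicate records."""
    seen = set()
    cleaned = []
    for row in rows:
        # Strip stray white space from every string field.
        tidy = {key: value.strip() if isinstance(value, str) else value
                for key, value in row.items()}
        # Fix one simple formatting inconsistency: emails in mixed case.
        tidy["email"] = tidy["email"].lower()
        # Two records with identical cleaned fields count as duplicates.
        fingerprint = tuple(sorted(tidy.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(tidy)
    return cleaned


# Hypothetical raw export with white space and a duplicate record.
raw_rows = [
    {"name": " Ana ", "email": "ANA@example.com "},
    {"name": "Ana", "email": "ana@example.com"},
    {"name": "Ben", "email": "ben@example.com"},
]
cleaned_rows = clean_records(raw_rows)
print(cleaned_rows)
```

The first two rows only differ by white space and letter case, so after cleaning they collapse into a single record.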

Now that you have a basic understanding of the key data analysis steps, let’s look at the top 17 essential methods.

17 Essential Types Of Data Analysis Methods

Before diving into the 17 essential types of methods, it is important that we quickly go over the main analysis categories. Moving from descriptive up to prescriptive analysis, the complexity and effort of data evaluation increase, but so does the added value for the company.

a) Descriptive analysis - What happened.

The descriptive analysis method is the starting point for any analytic reflection, and it aims to answer the question of what happened? It does this by ordering, manipulating, and interpreting raw data from various sources to turn it into valuable insights for your organization.

Performing descriptive analysis is essential, as it enables us to present our insights in a meaningful way. Although it is relevant to mention that this analysis on its own will not allow you to predict future outcomes or tell you the answer to questions like why something happened, it will leave your data organized and ready to conduct further investigations.
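As a small illustration of what "what happened" looks like in practice, the sketch below summarizes a week of daily orders with Python's built-in `statistics` module. The figures are invented for the example.

```python
import statistics

# Hypothetical daily order counts for one week.
daily_orders = [120, 135, 128, 150, 142, 90, 131]

summary = {
    "count": len(daily_orders),
    "mean": statistics.mean(daily_orders),
    "median": statistics.median(daily_orders),
    "stdev": statistics.stdev(daily_orders),  # sample standard deviation
    "min": min(daily_orders),
    "max": max(daily_orders),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

A summary like this describes the week, but as noted above, it does not explain why Saturday dipped to 90 orders or predict next week's numbers.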

b) Exploratory analysis - How to explore data relationships.

As its name suggests, the main aim of the exploratory analysis is to explore. Prior to it, there is still no notion of the relationship between the data and the variables. Once the data is investigated, exploratory analysis helps you to find connections and generate hypotheses and solutions for specific problems. A typical area of application for it is data mining.

c) Diagnostic analysis - Why it happened.

Diagnostic data analytics empowers analysts and executives by helping them gain a firm contextual understanding of why something happened. If you know why something happened as well as how it happened, you will be able to pinpoint the exact ways of tackling the issue or challenge.

Designed to provide direct and actionable answers to specific questions, this is one of the most important methods in research, and it also serves key organizational functions such as retail analytics.

d) Predictive analysis - What will happen.

The predictive method allows you to look into the future to answer the question: what will happen? In order to do this, it uses the results of the previously mentioned descriptive, exploratory, and diagnostic analysis, in addition to machine learning (ML) and artificial intelligence (AI). Through this, you can uncover future trends, potential problems or inefficiencies, connections, and causalities in your data.

With predictive analysis, you can unfold and develop initiatives that will not only enhance your various operational processes but also help you gain an all-important edge over the competition. If you understand why a trend, pattern, or event happened through data, you will be able to develop an informed projection of how things may unfold in particular areas of the business.

e) Prescriptive analysis - How will it happen.

Prescriptive analysis is another of the most effective types of analysis methods in research. Prescriptive data techniques cross over from predictive analysis in that they revolve around using patterns or trends to develop responsive, practical business strategies.

By drilling down into prescriptive analysis, you will play an active role in the data consumption process by taking well-arranged sets of visual data and using them as a powerful fix to emerging issues in a number of key areas, including marketing, sales, customer experience, HR, fulfillment, finance, logistics analytics, and others.

Top 17 data analysis methods

As mentioned at the beginning of the post, data analysis methods can be divided into two big categories: quantitative and qualitative. Each of these categories holds a powerful analytical value that changes depending on the scenario and type of data you are working with. Below, we will discuss 17 methods that are divided into qualitative and quantitative approaches. 

Without further ado, here are the 17 essential types of data analysis methods with some use cases in the business world: 

A. Quantitative Methods 

To put it simply, quantitative analysis refers to all methods that use numerical data or data that can be turned into numbers (e.g. category variables like gender, age, etc.) to extract valuable insights. It is used to extract valuable conclusions about relationships, differences, and test hypotheses. Below we discuss some of the key quantitative methods. 

1. Cluster analysis

The action of grouping a set of data elements in a way that said elements are more similar (in a particular sense) to each other than to those in other groups – hence the term ‘cluster.’ Since there is no target variable when clustering, the method is often used to find hidden patterns in the data. The approach is also used to provide additional context to a trend or dataset.

Let's look at it from an organizational perspective. In a perfect world, marketers would be able to analyze each customer separately and give them the best-personalized service, but let's face it, with a large customer base, it is impossible to do that in a timely manner. That's where clustering comes in. By grouping customers into clusters based on demographics, purchasing behaviors, monetary value, or any other factor that might be relevant for your company, you will be able to immediately optimize your efforts and give your customers the best experience based on their needs.
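To give a feel for how customers end up in clusters, here is a minimal sketch of k-means, one common clustering algorithm (the post does not prescribe a specific one). The customer data (orders per year, annual spend) and the starting centroids are assumptions made for the example.

```python
import math


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then re-average."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [math.dist(point, centroid) for centroid in centroids]
            clusters[distances.index(min(distances))].append(point)
        # New centroid = mean of the assigned points (unchanged if cluster is empty).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers: (orders per year, annual spend).
customers = [(2, 200), (3, 250), (2, 220), (20, 2100), (22, 2300), (19, 2000)]
centroids, clusters = k_means(customers, centroids=[customers[0], customers[3]])

for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

Here the occasional buyers and the heavy buyers fall into two separate groups. In a real analysis you would normally scale the variables first, since a large-valued variable such as spend otherwise dominates the distance calculation.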

2. Cohort analysis

This type of data analysis approach uses historical data to examine and compare a determined segment of users' behavior, which can then be grouped with others with similar characteristics. By using this methodology, it's possible to gain a wealth of insight into consumer needs or a firm understanding of a broader target group.

Cohort analysis can be really useful for performing analysis in marketing as it will allow you to understand the impact of your campaigns on specific groups of customers. To exemplify, imagine you send an email campaign encouraging customers to sign up for your site. For this, you create two versions of the campaign with different designs, CTAs, and ad content. Later on, you can use cohort analysis to track the performance of the campaign for a longer period of time and understand which type of content is driving your customers to sign up, repurchase, or engage in other ways.  

A useful tool to start performing cohort analysis is Google Analytics. You can learn more about the benefits and limitations of using cohorts in GA in this useful guide. In the image below, you see an example of how you visualize a cohort in this tool. The segments (devices traffic) are divided into date cohorts (usage of devices) and then analyzed week by week to extract insights into performance.

Cohort analysis chart example from google analytics
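If you want to see the logic behind a cohort table without a tool, the following sketch groups users by signup week and computes, for each cohort, the share still active in later weeks. The activity log and its layout are hypothetical.

```python
from collections import defaultdict

# Hypothetical activity log: (user_id, signup_week, active_week).
activity = [
    ("u1", 1, 1), ("u1", 1, 2), ("u1", 1, 3),
    ("u2", 1, 1), ("u2", 1, 2),
    ("u3", 2, 2), ("u3", 2, 3),
    ("u4", 2, 2),
]

cohort_users = defaultdict(set)   # signup_week -> all users in that cohort
active_users = defaultdict(set)   # (signup_week, weeks_since_signup) -> active users
for user, signup_week, active_week in activity:
    cohort_users[signup_week].add(user)
    active_users[(signup_week, active_week - signup_week)].add(user)

# Retention = active users in a given week / size of the cohort.
retention = {
    key: len(users) / len(cohort_users[key[0]])
    for key, users in active_users.items()
}

for (signup_week, offset), rate in sorted(retention.items()):
    print(f"cohort week {signup_week}, +{offset} weeks: {rate:.0%}")
```

Each row of the printed output corresponds to one cell of a cohort chart like the one shown above.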

3. Regression analysis

Regression uses historical data to understand how a dependent variable's value is affected when one (simple linear regression) or more (multiple regression) independent variables change or stay the same. By understanding each variable's relationship and how it developed in the past, you can anticipate possible outcomes and make better decisions in the future.

Let's break it down with an example. Imagine you did a regression analysis of your sales in 2019 and discovered that variables like product quality, store design, customer service, marketing campaigns, and sales channels affected the overall result. Now you want to use regression to analyze which of these variables changed or if any new ones appeared during 2020. For example, you couldn’t sell as much in your physical store due to COVID lockdowns. Therefore, your sales could’ve either dropped in general or increased in your online channels. Through this, you can understand which independent variables affected the overall performance of your dependent variable, annual sales.

If you want to go deeper into this type of analysis, check out this article and learn more about how you can benefit from regression.
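For readers who like to see the mechanics, here is a minimal sketch of a simple linear regression fitted with ordinary least squares. It uses one independent variable (marketing spend) and one dependent variable (annual sales); both series are invented for the example, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical figures, both in thousands.
marketing_spend = [10, 20, 30, 40, 50]
annual_sales = [120, 140, 160, 180, 200]

intercept, slope = fit_line(marketing_spend, annual_sales)
print(f"sales = {intercept:.1f} + {slope:.1f} * marketing spend")
```

The slope tells you how much the dependent variable changes, on average, for each one-unit change in the independent variable. With several independent variables you would move to multiple regression, typically with a statistics package.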

4. Neural networks

The neural network forms the basis for the intelligent algorithms of machine learning. It is a form of analytics that attempts, with minimal intervention, to understand how the human brain would generate insights and predict values. Neural networks learn from each and every data transaction, meaning that they evolve and advance over time.

A typical area of application for neural networks is predictive analytics. There are BI reporting tools that have this feature implemented within them, such as the Predictive Analytics Tool from datapine. This tool enables users to quickly and easily generate all kinds of predictions. All you have to do is select the data to be processed based on your KPIs, and the software automatically calculates forecasts based on historical and current data. Thanks to its user-friendly interface, anyone in your organization can manage it; there’s no need to be an advanced scientist. 

Here is an example of how you can use the predictive analysis tool from datapine:

Example on how to use predictive analytics tool from datapine

5. Factor analysis

Factor analysis, a form of “dimension reduction,” is a type of data analysis used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. The aim here is to uncover independent latent variables, an ideal method for streamlining specific segments.

A good way to understand this data analysis method is a customer evaluation of a product. The initial assessment is based on different variables like color, shape, wearability, current trends, materials, comfort, the place where they bought the product, and frequency of usage. Like this, the list can be endless, depending on what you want to track. In this case, factor analysis comes into the picture by summarizing all of these variables into homogeneous groups, for example, by grouping the variables color, materials, quality, and trends into a broader latent variable of design.

If you want to start analyzing data using factor analysis we recommend you take a look at this practical guide from UCLA.

6. Data mining

A method of data analysis that is the umbrella term for engineering metrics and insights for additional value, direction, and context. By using exploratory statistical evaluation, data mining aims to identify dependencies, relations, patterns, and trends to generate advanced knowledge.  When considering how to analyze data, adopting a data mining mindset is essential to success - as such, it’s an area that is worth exploring in greater detail.

An excellent use case of data mining is datapine intelligent data alerts. With the help of artificial intelligence and machine learning, they provide automated signals based on particular commands or occurrences within a dataset. For example, if you’re monitoring supply chain KPIs, you could set an intelligent alarm to trigger when invalid or low-quality data appears. By doing so, you will be able to drill down deep into the issue and fix it swiftly and effectively.

In the following picture, you can see how the intelligent alarms from datapine work. By setting up ranges on daily orders, sessions, and revenues, the alarms will notify you if the goal was not completed or if it exceeded expectations.

Example on how to use intelligent alerts from datapine

7. Time series analysis

As its name suggests, time series analysis is used to analyze a set of data points collected over a specified period of time. Although analysts use this method to monitor the data points in a specific interval of time rather than just monitoring them intermittently, the time series analysis is not uniquely used for the purpose of collecting data over time. Instead, it allows researchers to understand whether variables changed during the study, how the different variables depend on one another, and how the end result was reached. 

In a business context, this method is used to understand the causes of different trends and patterns to extract valuable insights. Another way of using this method is with the help of time series forecasting. Powered by predictive technologies, businesses can analyze various data sets over a period of time and forecast different future events. 

A great use case to put time series analysis into perspective is seasonality effects on sales. By using time series forecasting to analyze sales data of a specific product over time, you can understand if sales rise over a specific period of time (e.g. swimwear during summertime, or candy during Halloween). These insights allow you to predict demand and prepare production accordingly.  
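The seasonality idea can be sketched in a few lines. The example below smooths a year of monthly sales with a moving average and computes a simple seasonal index (each month divided by the yearly average). The swimwear figures are made up, and real forecasting tools use more robust methods than this.

```python
# Hypothetical swimwear units sold per month, January to December.
monthly_sales = [100, 110, 150, 210, 260, 300, 310, 280, 200, 140, 110, 100]


def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


smoothed = moving_average(monthly_sales, window=3)

# Seasonal index: values above 1 mean the month sells more than the yearly average.
average = sum(monthly_sales) / len(monthly_sales)
seasonal_index = [value / average for value in monthly_sales]
peak_month = seasonal_index.index(max(seasonal_index)) + 1  # 1 = January

print("smoothed:", [round(v, 1) for v in smoothed])
print("peak month:", peak_month)
```

An index well above 1 in the summer months is the kind of signal that lets you plan production ahead of demand.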

8. Decision Trees 

The decision tree analysis aims to act as a support tool to make smart and strategic decisions. By visually displaying potential outcomes, consequences, and costs in a tree-like model, researchers and company users can easily evaluate all factors involved and choose the best course of action. Decision trees are helpful to analyze quantitative data and they allow for an improved decision-making process by helping you spot improvement opportunities, reduce costs, and enhance operational efficiency and production.

But how does a decision tree actually work? This method works like a flowchart that starts with the main decision that you need to make and branches out based on the different outcomes and consequences of each decision. Each outcome will outline its own consequences, costs, and gains and, at the end of the analysis, you can compare each of them and make the smartest decision. 

Businesses can use them to understand which project is more cost-effective and will bring more earnings in the long run. For example, imagine you need to decide if you want to update your software app or build a new app entirely.  Here you would compare the total costs, the time needed to be invested, potential revenue, and any other factor that might affect your decision.  In the end, you would be able to see which of these two options is more realistic and attainable for your company or research.
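One common way to compare the branches of such a tree is by expected net value: probability-weighted revenue minus cost. The sketch below applies it to the app example. All costs, probabilities, and revenues are invented, and expected value is only one possible decision rule.

```python
# Hypothetical branches: each option has a cost and (probability, revenue) outcomes.
options = {
    "update existing app": {
        "cost": 50_000,
        "outcomes": [(0.7, 120_000), (0.3, 60_000)],
    },
    "build new app": {
        "cost": 150_000,
        "outcomes": [(0.5, 400_000), (0.5, 100_000)],
    },
}


def expected_net_value(option):
    """Probability-weighted revenue minus the upfront cost."""
    expected_revenue = sum(p * revenue for p, revenue in option["outcomes"])
    return expected_revenue - option["cost"]


scores = {name: expected_net_value(option) for name, option in options.items()}
best_option = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:,.0f}")
print("best:", best_option)
```

With these assumed numbers the new app comes out ahead, but it also carries more risk, which is why the time needed and other factors should be weighed alongside the expected value.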

9. Conjoint analysis 

Next, we have conjoint analysis. This approach is usually used in surveys to understand how individuals value different attributes of a product or service and it is one of the most effective methods to extract consumer preferences. When it comes to purchasing, some clients might be more price-focused, others more features-focused, and others might have a sustainable focus. Whatever your customer's preferences are, you can find them with conjoint analysis. Through this, companies can define pricing strategies, packaging options, subscription packages, and more. 

A great example of conjoint analysis is in marketing and sales. For instance, a cupcake brand might use conjoint analysis and find that its clients prefer gluten-free options and cupcakes with healthier toppings over super sugary ones. Thus, the cupcake brand can turn these insights into advertisements and promotions to increase sales of this particular type of product. And not just that, conjoint analysis can also help businesses segment their customers based on their interests. This allows them to send different messaging that will bring value to each of the segments. 
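To show roughly how preferences are extracted, here is a deliberately simplified sketch. It assumes respondents rated every combination of two attributes (a balanced design), and it estimates each level's "part-worth" as its average rating minus the overall average. The cupcake profiles and ratings are invented, and real conjoint studies usually estimate part-worths with regression or dedicated software.

```python
from collections import defaultdict

# Hypothetical profiles: (attribute levels, average respondent rating out of 10).
profiles = [
    ({"flour": "gluten-free", "topping": "fruit"}, 9),
    ({"flour": "gluten-free", "topping": "sugar"}, 7),
    ({"flour": "regular", "topping": "fruit"}, 6),
    ({"flour": "regular", "topping": "sugar"}, 4),
]

grand_mean = sum(rating for _, rating in profiles) / len(profiles)

# Collect the ratings of every profile that contains a given level.
ratings_by_level = defaultdict(list)
for levels, rating in profiles:
    for attribute, level in levels.items():
        ratings_by_level[(attribute, level)].append(rating)

# Part-worth: how far a level lifts (or lowers) the rating relative to the average.
part_worths = {
    key: sum(ratings) / len(ratings) - grand_mean
    for key, ratings in ratings_by_level.items()
}

for (attribute, level), worth in part_worths.items():
    print(f"{attribute}={level}: {worth:+.1f}")
```

In this made-up data, the gluten-free option adds more to the rating than the fruit topping does, which would suggest leading with it in promotions.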

10. Correspondence Analysis

Also known as reciprocal averaging, correspondence analysis is a method used to analyze the relationship between categorical variables presented within a contingency table. A contingency table is a table that displays two (simple correspondence analysis) or more (multiple correspondence analysis) categorical variables across rows and columns that show the distribution of the data, which is usually answers to a survey or questionnaire on a specific topic. 

This method starts by calculating an “expected value” for each table cell, which is done by multiplying the cell's row total by its column total and dividing the result by the overall total of the table. The “expected value” is then subtracted from the original (observed) value, resulting in a “residual number,” which is what allows you to extract conclusions about relationships and distribution. The results of this analysis are later displayed using a map that represents the relationship between the different values. The closer two values are on the map, the stronger the relationship. Let’s put it into perspective with an example. 

Imagine you are carrying out a market research analysis about outdoor clothing brands and how they are perceived by the public. For this analysis, you ask a group of people to match each brand with a certain attribute which can be durability, innovation, quality materials, etc. When calculating the residual numbers, you can see that brand A has a positive residual for innovation but a negative one for durability. This means that brand A is not positioned as a durable brand in the market, something that competitors could take advantage of. 
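The residual calculation can be sketched as follows, using the usual expected count for a contingency table (row total times column total, divided by the grand total). The two brands, two attributes, and the counts are hypothetical. A full correspondence analysis goes further by standardizing the residuals and plotting them on a map.

```python
brands = ["A", "B"]
attributes = ["innovation", "durability"]

# Hypothetical counts of respondents matching each brand with each attribute.
observed = [
    [30, 10],  # brand A
    [20, 40],  # brand B
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Expected count if brand and attribute were unrelated.
expected = [
    [row_total * column_total / grand_total for column_total in column_totals]
    for row_total in row_totals
]

# Residual = observed minus expected.
residuals = [
    [obs - exp for obs, exp in zip(observed_row, expected_row)]
    for observed_row, expected_row in zip(observed, expected)
]

for brand, row in zip(brands, residuals):
    for attribute, residual in zip(attributes, row):
        print(f"brand {brand}, {attribute}: {residual:+.1f}")
```

Brand A ends up with a positive residual for innovation and a negative one for durability, mirroring the scenario described above.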

11. Multidimensional Scaling (MDS)

MDS is a method used to observe the similarities or disparities between objects which can be colors, brands, people, geographical coordinates, and more. The objects are plotted using an “MDS map” that positions similar objects together and disparate ones far apart. The (dis)similarities between objects are represented using one or more dimensions that can be observed using a numerical scale. For example, if you want to know how people feel about the COVID-19 vaccine, you can use 1 for “don’t believe in the vaccine at all” and 10 for “firmly believe in the vaccine” and a scale of 2 to 9 for in-between responses. When analyzing an MDS map, the only thing that matters is the distance between the objects; the orientation of the dimensions is arbitrary and has no meaning at all. 

Multidimensional scaling is a valuable technique for market research, especially when it comes to evaluating product or brand positioning. For instance, if a cupcake brand wants to know how they are positioned compared to competitors, it can define 2-3 dimensions such as taste, ingredients, shopping experience, or more, and do a multidimensional scaling analysis to find improvement opportunities as well as areas in which competitors are currently leading. 

Another business example is in procurement when deciding on different suppliers. Decision makers can generate an MDS map to see how the different prices, delivery times, technical services, and more of the different suppliers differ and pick the one that suits their needs the best. 

A final example comes from a research paper titled "An Improved Study of Multilevel Semantic Network Visualization for Analyzing Sentiment Word of Movie Review Data". Researchers picked a two-dimensional MDS map to display the distances and relationships between different sentiments in movie reviews. They used 36 sentiment words and distributed them based on their emotional distance, as we can see in the image below, where the words "outraged" and "sweet" are on opposite sides of the map, marking the distance between the two emotions very clearly.

Example of multidimensional scaling analysis

Aside from being a valuable technique to analyze dissimilarities, MDS also serves as a dimension-reduction technique for large dimensional data. 

B. Qualitative Methods

Qualitative data analysis methods are defined as the observation of non-numerical data that is gathered and produced using methods of observation such as interviews, focus groups, questionnaires, and more. Compared with quantitative data, qualitative data is more subjective and highly valuable in analyzing customer retention and product development.

12. Text analysis

Text analysis, also known in the industry as text mining, works by taking large sets of textual data and arranging them in a way that makes it easier to manage. By working through this cleansing process in stringent detail, you will be able to extract the data that is truly relevant to your organization and use it to develop actionable insights that will propel you forward.

Modern software accelerates the application of text analytics. Thanks to the combination of machine learning and intelligent algorithms, you can perform advanced analytical processes such as sentiment analysis. This technique allows you to understand the intentions and emotions of a text, for example, if it's positive, negative, or neutral, and then give it a score depending on certain factors and categories that are relevant to your brand. Sentiment analysis is often used to monitor brand and product reputation and to understand how successful your customer experience is. To learn more about the topic check out this insightful article.
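As a rough idea of how a sentiment score can be produced, here is a minimal word-list sketch. It is not the machine learning approach described above: the positive and negative word lists are tiny, hypothetical examples, and the score is simply positive hits minus negative hits.

```python
import re

# Tiny hypothetical word lists; real lexicons contain thousands of entries.
POSITIVE = {"great", "love", "excellent", "fast"}
NEGATIVE = {"slow", "broken", "poor", "bad"}


def sentiment(text):
    """Return (score, label) where score = positive hits - negative hits."""
    words = re.findall(r"[a-z]+", text.lower())
    score = sum(word in POSITIVE for word in words) - sum(word in NEGATIVE for word in words)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return score, label


for review in [
    "Great product, fast delivery",
    "Arrived broken and support was slow",
    "It is a phone",
]:
    print(review, "->", sentiment(review))
```

A word list like this misses sarcasm, negation ("not great"), and context, which is exactly where trained models earn their keep.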

By analyzing data from various word-based sources, including product reviews, articles, social media communications, and survey responses, you will gain invaluable insights into your audience, as well as their needs, preferences, and pain points. This will allow you to create campaigns, services, and communications that meet your prospects’ needs on a personal level, growing your audience while boosting customer retention. There are various other “sub-methods” that are an extension of text analysis. Each of them serves a more specific purpose and we will look at them in detail next. 

13. Content Analysis

This is a straightforward and very popular method that examines the presence and frequency of certain words, concepts, and subjects in different content formats such as text, image, audio, or video, for example, the number of times the name of a celebrity is mentioned on social media or in online tabloids. It does this by coding text data that is later categorized and tabulated in a way that can provide valuable insights, making it a mix of quantitative and qualitative analysis.

There are two types of content analysis. The first one is the conceptual analysis which focuses on explicit data, for instance, the number of times a concept or word is mentioned in a piece of content. The second one is relational analysis, which focuses on the relationship between different concepts or words and how they are connected within a specific context. 
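
The two types can be sketched in a few lines. In this simplified example, conceptual analysis is a count of how often each concept appears, and relational analysis is reduced to counting how often two concepts appear in the same document. The sample documents and the concept list are made up, and real relational analysis usually looks at richer context than plain co-occurrence.

```python
import re
from collections import Counter
from itertools import combinations

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def concept_frequency(documents, concepts):
    """Conceptual analysis: count how often each concept is mentioned."""
    counts = Counter()
    for doc in documents:
        for token in tokenize(doc):
            if token in concepts:
                counts[token] += 1
    return counts

def concept_cooccurrence(documents, concepts):
    """Relational analysis (simplified): count documents where two concepts appear together."""
    pairs = Counter()
    for doc in documents:
        present = sorted(set(tokenize(doc)) & concepts)
        for pair in combinations(present, 2):
            pairs[pair] += 1
    return pairs

# Hypothetical review snippets and concepts of interest.
docs = [
    "The price is high but the quality is excellent",
    "Quality dropped after the update",
    "Great price, great quality, slow delivery",
]
concepts = {"price", "quality", "delivery"}
freq = concept_frequency(docs, concepts)
links = concept_cooccurrence(docs, concepts)
```

Here `freq` answers "how often is each concept mentioned?", while `links` hints at which concepts customers tend to bring up together.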

Content analysis is often used by marketers to measure brand reputation and customer behavior, for example, by analyzing customer reviews. It can also be used to analyze customer interviews and find directions for new product development. It is also important to note that, in order to extract the maximum potential out of this analysis method, it is necessary to have a clearly defined research question. 

14. Thematic Analysis

Very similar to content analysis, thematic analysis also helps in identifying and interpreting patterns in qualitative data, with the main difference being that content analysis can also be applied to quantitative analysis. The thematic method analyzes large pieces of text data such as focus group transcripts or interviews and groups them into themes or categories that come up frequently within the text. It is a great method when trying to figure out people's views and opinions about a certain topic. For example, if you are a brand that cares about sustainability, you can do a survey of your customers to analyze their views and opinions about sustainability and how they apply it to their lives. You can also analyze customer service call transcripts to find common issues and improve your service. 

Thematic analysis is a very subjective technique that relies on the researcher’s judgment. Therefore, to avoid biases, it follows six steps: familiarization, coding, generating themes, reviewing themes, defining and naming themes, and writing up. It is also important to note that, because it is a flexible approach, the data can be interpreted in multiple ways and it can be hard to select which data is more important to emphasize. 
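
The coding and theme-generation steps are judgment calls made by the researcher, but the bookkeeping around them can be sketched in code. In the example below, the excerpts, the codes, and the code-to-theme mapping are all hypothetical; the function only groups already-coded excerpts under their themes so they can be reviewed.

```python
from collections import defaultdict

# Hypothetical output of the coding step: (excerpt, code) pairs assigned by a researcher.
coded_excerpts = [
    ("I always bring my own bag", "reusable items"),
    ("Recycling bins are hard to find", "recycling access"),
    ("I refill bottles at home", "reusable items"),
    ("Eco products cost too much", "price barrier"),
]

# Hypothetical result of the "generating themes" step: each code belongs to a broader theme.
CODE_TO_THEME = {
    "reusable items": "Everyday habits",
    "recycling access": "Barriers",
    "price barrier": "Barriers",
}

def group_by_theme(excerpts, code_to_theme):
    """Collect excerpts under the theme of their code; unknown codes go to 'Unassigned'."""
    themes = defaultdict(list)
    for excerpt, code in excerpts:
        themes[code_to_theme.get(code, "Unassigned")].append(excerpt)
    return dict(themes)

themes = group_by_theme(coded_excerpts, CODE_TO_THEME)
theme_sizes = {theme: len(items) for theme, items in themes.items()}
```

Keeping an explicit "Unassigned" bucket makes it easier to spot codes that still need a home during the reviewing step.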

15. Narrative Analysis 

A bit more complex in nature than the two previous ones, narrative analysis is used to explore the meaning behind the stories that people tell and most importantly, how they tell them. By looking into the words that people use to describe a situation you can extract valuable conclusions about their perspective on a specific topic. Common sources for narrative data include autobiographies, family stories, opinion pieces, and testimonials, among others. 

From a business perspective, narrative analysis can be useful to analyze customer behaviors and feelings towards a specific product, service, feature, or others. It provides unique and deep insights that can be extremely valuable. However, it has some drawbacks.  

The biggest weakness of this method is that the sample sizes are usually very small due to the complexity and time-consuming nature of the collection of narrative data. Plus, the way a subject tells a story will be significantly influenced by his or her specific experiences, making it very hard to replicate in a subsequent study. 

16. Discourse Analysis

Discourse analysis is used to understand the meaning behind any type of written, verbal, or symbolic discourse based on its political, social, or cultural context. It mixes the analysis of languages and situations together. This means that the way the content is constructed and the meaning behind it is significantly influenced by the culture and society it takes place in. For example, if you are analyzing political speeches you need to consider different context elements such as the politician's background, the current political context of the country, the audience to which the speech is directed, and so on. 

From a business point of view, discourse analysis is a great market research tool. It allows marketers to understand how the norms and ideas of the specific market work and how their customers relate to those ideas. It can be very useful to build a brand mission or develop a unique tone of voice. 

17. Grounded Theory Analysis

Traditionally, researchers decide on a method and hypothesis and start to collect the data to prove that hypothesis. The grounded theory is the only method that doesn’t require an initial research question or hypothesis as its value lies in the generation of new theories. With the grounded theory method, you can go into the analysis process with an open mind and explore the data to generate new theories through tests and revisions. In fact, it is not necessary to collect the data and then start to analyze it. Researchers usually start to find valuable insights as they are gathering the data. 

All of these elements make grounded theory a very valuable method as theories are fully backed by data instead of initial assumptions. It is a great technique to analyze poorly researched topics or find the causes behind specific company outcomes. For example, product managers and marketers might use the grounded theory to find the causes of high levels of customer churn and look into customer surveys and reviews to develop new theories about the causes. 

How To Analyze Data? Top 17 Data Analysis Techniques To Apply

Now that we’ve answered the questions “what is data analysis?” and “why is it important?”, and covered the different data analysis types, it’s time to dig deeper into how to perform your analysis by working through these 17 essential techniques.

1. Collaborate your needs

Before you begin analyzing or drilling down into any techniques, it’s crucial to sit down collaboratively with all key stakeholders within your organization, decide on your primary campaign or strategic goals, and gain a fundamental understanding of the types of insights that will best benefit your progress or provide you with the level of vision you need to evolve your organization.

2. Establish your questions

Once you’ve outlined your core objectives, you should consider which questions will need answering to help you achieve your mission. This is one of the most important techniques as it will shape the very foundations of your success.

To help you ask the right things and ensure your data works for you, you have to ask the right data analysis questions.

3. Data democratization

After giving your data analytics methodology some real direction, and knowing which questions need answering to extract optimum value from the information available to your organization, you should continue with democratization.

Data democratization is an action that aims to connect data from various sources efficiently and quickly so that anyone in your organization can access it at any given moment. You can extract data in text, images, videos, numbers, or any other format, and then perform cross-database analysis to achieve more advanced insights to share with the rest of the company interactively.

Once you have decided on your most valuable sources, you need to take all of this into a structured format to start collecting your insights. For this purpose, datapine offers an easy all-in-one data connectors feature to integrate all your internal and external sources and manage them at your will. Additionally, datapine’s end-to-end solution automatically updates your data, allowing you to save time and focus on performing the right analysis to grow your company.

4. Think of governance 

When collecting data in a business or research context you always need to think about security and privacy. With data breaches becoming a topic of concern for businesses, the need to protect your client's or subject’s sensitive information becomes critical. 

To ensure that all this is taken care of, you need to think of a data governance strategy. According to Gartner, this concept refers to “the specification of decision rights and an accountability framework to ensure the appropriate behavior in the valuation, creation, consumption, and control of data and analytics.” In simpler words, data governance is a collection of processes, roles, and policies that ensure the efficient use of data while still achieving the main company goals. It ensures that clear roles are in place for who can access the information and how they can access it. In time, this not only ensures that sensitive information is protected but also allows for an efficient analysis as a whole. 

5. Clean your data

After harvesting from so many sources you will be left with a vast amount of information that can be overwhelming to deal with. At the same time, you can be faced with incorrect data that can be misleading to your analysis. The smartest thing you can do to avoid dealing with this in the future is to clean the data. This is fundamental before visualizing it, as it will ensure that the insights you extract from it are correct.

There are many things that you need to look for in the cleaning process. The most important one is to eliminate any duplicate observations; this usually appears when using multiple internal and external sources of information. You can also add any missing codes, fix empty fields, and eliminate incorrectly formatted data.
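
As a rough sketch of these cleaning steps, the function below removes duplicate observations, fills empty or missing fields with a placeholder, and tidies inconsistent whitespace. The field names, the sample records, and the choice of `"unknown"` as a placeholder are assumptions made for illustration; in practice, how you treat missing values depends on the analysis.

```python
def clean_records(records, fields, placeholder="unknown"):
    """Drop duplicate observations, fill empty fields, and normalize whitespace."""
    seen = set()
    cleaned = []
    for record in records:
        row = {}
        for field in fields:
            value = record.get(field)
            if isinstance(value, str):
                value = " ".join(value.split())  # trim and collapse whitespace
            row[field] = value if value not in (None, "") else placeholder
        # Compare case-insensitively so "UK" and "uk" count as the same observation.
        key = tuple(str(row[field]).lower() for field in fields)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned

# Hypothetical raw data merged from two sources.
raw = [
    {"customer": "Ada Lovelace", "country": "UK"},
    {"customer": "  ada  lovelace ", "country": "uk"},  # duplicate with messy formatting
    {"customer": "Grace Hopper", "country": ""},          # empty field
    {"customer": "Alan Turing"},                          # missing field
]
clean = clean_records(raw, ["customer", "country"])
```

The first occurrence of each observation is kept as-is, so the order in which sources are merged decides which spelling survives.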

Another usual form of cleaning is done with text data. As we mentioned earlier, most companies today analyze customer reviews, social media comments, questionnaires, and several other text inputs. In order for algorithms to detect patterns, text data needs to be revised to avoid invalid characters or any syntax or spelling errors. 
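
For text inputs, a minimal normalization pass might look like the sketch below. It handles invalid characters, leftover markup, and stray whitespace; it does not attempt spelling correction, which normally needs a dictionary or a dedicated library. The sample comment is invented.

```python
import re
import unicodedata

def clean_text(text):
    """Normalize a free-text comment before looking for patterns in it."""
    text = unicodedata.normalize("NFKC", text)  # unify equivalent Unicode forms
    # Drop control characters but keep whitespace such as tabs and newlines.
    text = "".join(ch for ch in text if ch.isprintable() or ch.isspace())
    text = re.sub(r"<[^>]+>", " ", text)        # strip leftover HTML tags
    text = re.sub(r"\s+", " ", text).strip()    # collapse runs of whitespace
    return text.lower()

sample = "  Great   product!<br>\x00Would buy\tagain  "
cleaned = clean_text(sample)
```

Lowercasing is convenient for counting words, but it should be skipped when capitalization carries meaning, for instance with brand names.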

Most importantly, the aim of cleaning is to prevent you from arriving at false conclusions that can damage your company in the long run. By using clean data, you will also help BI solutions to interact better with your information and create better reports for your organization.

6. Set your KPIs

Once you’ve set your sources, cleaned your data, and established clear-cut questions you want your insights to answer, you need to set a host of key performance indicators (KPIs) that will help you track, measure, and shape your progress in a number of key areas.

KPIs are critical to both qualitative and quantitative analysis research. This is one of the primary methods of data analysis you certainly shouldn’t overlook.

To help you set the best possible KPIs for your initiatives and activities, here is an example of a relevant logistics KPI: transportation-related costs. If you want to see more, explore our collection of key performance indicator examples.

7. Omit useless data

Having bestowed your data analysis tools and techniques with true purpose and defined your mission, you should explore the raw data you’ve collected from all sources and use your KPIs as a reference for chopping out any information you deem to be useless.

Trimming the informational fat is one of the most crucial methods of analysis as it will allow you to focus your analytical efforts and squeeze every drop of value from the remaining ‘lean’ information.

Any stats, facts, figures, or metrics that don’t align with your business goals or fit with your KPI management strategies should be eliminated from the equation.

8. Build a data management roadmap

While, at this point, this particular step is optional (you will have already gained a wealth of insight and formed a fairly sound strategy by now), creating a data management roadmap will help your data analysis methods and techniques become successful on a more sustainable basis. These roadmaps, if developed properly, are also built so they can be tweaked and scaled over time.

Invest ample time in developing a roadmap that will help you store, manage, and handle your data internally, and you will make your analysis techniques all the more fluid and functional – one of the most powerful types of data analysis methods available today.

9. Integrate technology

There are many ways to analyze data, but one of the most vital aspects of analytical success in a business context is integrating the right decision support software and technology.

Robust analysis platforms will not only allow you to pull critical data from your most valuable sources while working with dynamic KPIs that will offer you actionable insights; they will also present them in a digestible, visual, interactive format from one central, live dashboard. That is a data methodology you can count on.

By integrating the right technology within your data analysis methodology, you’ll avoid fragmenting your insights, saving you time and effort while allowing you to enjoy the maximum value from your business’s most valuable insights.

For a look at the power of software for the purpose of analysis and to enhance your methods of analyzing, glance over our selection of dashboard examples.

10. Answer your questions

By considering each of the above efforts, working with the right technology, and fostering a cohesive internal culture where everyone buys into the different ways to analyze data as well as the power of digital intelligence, you will swiftly start to answer your most burning business questions. Arguably, the best way to make your data concepts accessible across the organization is through data visualization.

11. Visualize your data

Online data visualization is a powerful tool as it lets you tell a story with your metrics, allowing users across the organization to extract meaningful insights that aid business evolution – and it covers all the different ways to analyze data.

The purpose of analyzing is to make your entire organization more informed and intelligent, and with the right platform or dashboard, this is simpler than you think, as demonstrated by our marketing dashboard.

An executive dashboard example showcasing high-level marketing KPIs such as cost per lead, MQL, SQL, and cost per customer.

This visual, dynamic, and interactive online dashboard is a data analysis example designed to give Chief Marketing Officers (CMO) an overview of relevant metrics to help them understand if they achieved their monthly goals.

In detail, this example generated with a modern dashboard creator displays interactive charts for monthly revenues, costs, net income, and net income per customer; all of them are compared with the previous month so that you can understand how the data fluctuated. In addition, it shows a detailed summary of the number of users, customers, SQLs, and MQLs per month to visualize the whole picture and extract relevant insights or trends for your marketing reports.

The CMO dashboard is well suited to C-level management as it can help them monitor the strategic outcome of their marketing efforts and make data-driven decisions that benefit the company.

12. Be careful with the interpretation

We already dedicated an entire post to data interpretation as it is a fundamental part of the process of data analysis. It gives meaning to the analytical information and aims to draw a concise conclusion from the analysis results. Since most of the time companies are dealing with data from many different sources, the interpretation stage needs to be done carefully and properly in order to avoid misinterpretations. 

To help you through the process, here we list three common practices that you need to avoid at all costs when looking at your data:

  • Correlation vs. causation: The human brain is wired to find patterns. This behavior leads to one of the most common mistakes when performing interpretation: confusing correlation with causation. Although these two aspects can exist simultaneously, it is not correct to assume that because two things happened together, one provoked the other. A piece of advice to avoid falling into this mistake is never to trust intuition alone; trust the data. If there is no objective evidence of causation, then always stick to correlation. 
  • Confirmation bias: This phenomenon describes the tendency to select and interpret only the data necessary to prove one hypothesis, often ignoring the elements that might disprove it. Even if it's not done on purpose, confirmation bias can represent a real problem, as excluding relevant information can lead to false conclusions and, therefore, bad business decisions. To avoid it, always try to disprove your hypothesis instead of proving it, share your analysis with other team members, and avoid drawing any conclusions before the entire analytical project is finalized.
  • Statistical significance: To put it in short words, statistical significance helps analysts understand if a result is actually accurate or if it happened because of a sampling error or pure chance. The level of statistical significance needed might depend on the sample size and the industry being analyzed. In any case, ignoring the significance of a result when it might influence decision-making can be a huge mistake.
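
The correlation and significance points can be illustrated with a short sketch. It computes a Pearson correlation coefficient and then runs a permutation test, one of several ways to estimate how likely a correlation that strong would be by pure chance. The monthly figures are made up, and the number of trials and the random seed are arbitrary choices for this example.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def permutation_p_value(xs, ys, trials=5000, seed=0):
    """Share of random shufflings whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)  # add-one correction avoids reporting exactly zero

# Hypothetical monthly figures, invented for illustration.
ad_spend = [10, 12, 15, 17, 20, 22, 25, 27]
sales = [40, 44, 52, 55, 63, 66, 75, 79]
r = pearson_r(ad_spend, sales)
p = permutation_p_value(ad_spend, sales)
```

A small `p` only says the association is unlikely to be a fluke of sampling. It says nothing about whether ad spend causes sales, which is exactly the correlation-versus-causation trap described above.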

13. Build a narrative

Now, we’re going to look at how you can bring all of these elements together in a way that will benefit your business - starting with a little something called data storytelling.

The human brain responds incredibly well to strong stories or narratives. Once you’ve cleansed, shaped, and visualized your most invaluable data using various BI dashboard tools, you should strive to tell a story - one with a clear-cut beginning, middle, and end.

By doing so, you will make your analytical efforts more accessible, digestible, and universal, empowering more people within your organization to use your discoveries to their actionable advantage.

14. Consider autonomous technology

Autonomous technologies, such as artificial intelligence (AI) and machine learning (ML), play a significant role in the advancement of understanding how to analyze data more effectively.

Gartner predicts that by the end of this year, 80% of emerging technologies will be developed with AI foundations. This is a testament to the ever-growing power and value of autonomous technologies.

At the moment, these technologies are revolutionizing the analysis industry. Some examples that we mentioned earlier are neural networks, intelligent alarms, and sentiment analysis.

15. Share the load

If you work with the right tools and dashboards, you will be able to present your metrics in a digestible, value-driven format, allowing almost everyone in the organization to connect with and use relevant data to their advantage.

Modern dashboards consolidate data from various sources, providing access to a wealth of insights in one centralized location, no matter if you need to monitor recruitment metrics or generate reports that need to be sent across numerous departments. Moreover, these cutting-edge tools offer access to dashboards from a multitude of devices, meaning that everyone within the business can connect with practical insights remotely - and share the load.

Once everyone is able to work with a data-driven mindset, you will catalyze the success of your business in ways you never thought possible. And when it comes to knowing how to analyze data, this kind of collaborative approach is essential.

16. Data analysis tools

In order to perform high-quality analysis of data, it is fundamental to use tools and software that will ensure the best results. Here we leave you a small summary of four fundamental categories of data analysis tools for your organization.

  • Business Intelligence: BI tools allow you to process significant amounts of data from several sources in any format. Through this, you can not only analyze and monitor your data to extract relevant insights but also create interactive reports and dashboards to visualize your KPIs and use them for your company's good. datapine is an amazing online BI software that is focused on delivering powerful online analysis features that are accessible to beginner and advanced users. Like this, it offers a full-service solution that includes cutting-edge analysis of data, KPIs visualization, live dashboards, reporting, and artificial intelligence technologies to predict trends and minimize risk.
  • Statistical analysis: These tools are usually designed for scientists, statisticians, market researchers, and mathematicians, as they allow them to perform complex statistical analyses with methods like regression analysis, predictive analysis, and statistical modeling. A good tool to perform this type of analysis is RStudio, as it offers powerful data modeling and hypothesis testing features that can cover both academic and general data analysis. This tool is one of the industry favorites due to its capability for data cleaning, data reduction, and performing advanced analysis with several statistical methods. Another relevant tool to mention is SPSS from IBM. The software offers advanced statistical analysis for users of all skill levels. Thanks to a vast library of machine learning algorithms, text analysis, and a hypothesis testing approach, it can help your company find relevant insights to drive better decisions. SPSS also works as a cloud service that enables you to run it anywhere.
  • SQL Consoles: SQL is a programming language often used to handle structured data in relational databases. Tools like these are popular among data scientists as they are extremely effective in unlocking these databases' value. Undoubtedly, one of the most used SQL tools in the market is MySQL Workbench. This tool offers several features such as a visual tool for database modeling and monitoring, complete SQL optimization, administration tools, and visual performance dashboards to keep track of KPIs.
  • Data Visualization: These tools are used to represent your data through charts, graphs, and maps that allow you to find patterns and trends in the data. datapine's already mentioned BI platform also offers a wealth of powerful online data visualization tools with several benefits. Some of them include: delivering compelling data-driven presentations to share with your entire company, the ability to see your data online with any device wherever you are, an interactive dashboard design feature that enables you to showcase your results in an interactive and understandable way, and to perform online self-service reports that can be used simultaneously with several other people to enhance team productivity.

17. Refine your process constantly 

Last is a step that might seem obvious to some people, but it can be easily ignored if you think you are done. Once you have extracted the needed results, you should always take a retrospective look at your project and think about what you can improve. As you saw throughout this long list of techniques, data analysis is a complex process that requires constant refinement. For this reason, you should always go one step further and keep improving. 

Quality Criteria For Data Analysis

So far we’ve covered a list of methods and techniques that should help you perform efficient data analysis. But how do you measure the quality and validity of your results? This is done with the help of some science quality criteria. Here we will go into a more theoretical area that is critical to understanding the fundamentals of statistical analysis in science. However, you should also be aware of these steps in a business context, as they will allow you to assess the quality of your results in the correct way. Let’s dig in. 

  • Internal validity: The results of a survey are internally valid if they measure what they are supposed to measure and thus provide credible results. In other words, internal validity measures the trustworthiness of the results and how they can be affected by factors such as the research design, operational definitions, how the variables are measured, and more. For instance, imagine you are doing an interview to ask people if they brush their teeth two times a day. While most of them will answer yes, you can still notice that their answers correspond to what is socially acceptable, which is to brush your teeth at least twice a day. In this case, you can’t be 100% sure if respondents actually brush their teeth twice a day or if they just say that they do; therefore, the internal validity of this interview is very low. 
  • External validity: Essentially, external validity refers to the extent to which the results of your research can be applied to a broader context. It basically aims to prove that the findings of a study can be applied in the real world. If the research can be applied to other settings, individuals, and times, then the external validity is high. 
  • Reliability: If your research is reliable, it means that it can be reproduced. If your measurement were repeated under the same conditions, it would produce similar results. This means that your measuring instrument consistently produces reliable results. For example, imagine a doctor building a symptoms questionnaire to detect a specific disease in a patient. Then, various other doctors use this questionnaire but end up diagnosing the same patient with a different condition. This means the questionnaire is not reliable in detecting the initial disease. Another important note here is that in order for your research to be reliable, it also needs to be objective. If the results of a study are the same, independent of who assesses them or interprets them, the study can be considered reliable. Let’s see the objectivity criteria in more detail now. 
  • Objectivity: In data science, objectivity means that the researcher needs to stay fully objective when it comes to their analysis. The results of a study need to be affected by objective criteria and not by the beliefs, personality, or values of the researcher. Objectivity needs to be ensured when you are gathering the data; for example, when interviewing individuals, the questions need to be asked in a way that doesn't influence the results. Paired with this, objectivity also needs to be thought of when interpreting the data. If different researchers reach the same conclusions, then the study is objective. For this last point, you can set predefined criteria to interpret the results to ensure all researchers follow the same steps. 
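
One common way to put a number on whether different assessors reach the same conclusions is an agreement statistic such as Cohen's kappa, which compares the observed agreement between two raters with the agreement expected by chance. The sketch below is a minimal implementation; the two lists of labels are hypothetical, and the article itself does not prescribe this particular measure.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    # Note: undefined (division by zero) when both raters always use one identical label.
    return (observed - expected) / (1 - expected)

# Hypothetical labels given by two researchers to the same ten interview answers.
rater_a = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "pos"]
rater_b = ["pos", "pos", "neg", "pos", "pos", "neg", "pos", "neg", "neg", "pos"]
kappa = cohens_kappa(rater_a, rater_b)
```

A kappa near 1 indicates strong agreement beyond chance, while a value near 0 means the raters agree about as often as random labeling would.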

The discussed quality criteria cover mostly potential influences in a quantitative context. Analysis in qualitative research has by default additional subjective influences that must be controlled in a different way. Therefore, there are other quality criteria for this kind of research such as credibility, transferability, dependability, and confirmability. You can see each of them in more detail in this resource. 

Data Analysis Limitations & Barriers

Analyzing data is not an easy task. As you’ve seen throughout this post, there are many steps and techniques that you need to apply in order to extract useful information from your research. While a well-performed analysis can bring various benefits to your organization it doesn't come without limitations. In this section, we will discuss some of the main barriers you might encounter when conducting an analysis. Let’s see them more in detail. 

  • Lack of clear goals: No matter how good your data or analysis might be, if you don’t have clear goals or a hypothesis, the process might be worthless. While we mentioned some methods that don’t require a predefined hypothesis, it is always better to enter the analytical process with some clear guidelines of what you are expecting to get out of it, especially in a business context in which data is utilized to support important strategic decisions. 
  • Objectivity: Arguably one of the biggest barriers when it comes to data analysis in research is to stay objective. When trying to prove a hypothesis, researchers might find themselves, intentionally or unintentionally, directing the results toward an outcome that they want. To avoid this, always question your assumptions and avoid confusing facts with opinions. You can also show your findings to a research partner or external person to confirm that your results are objective. 
  • Data representation: A fundamental part of the analytical procedure is the way you represent your data. You can use various graphs and charts to represent your findings, but not all of them will work for all purposes. Choosing the wrong visual can not only damage your analysis but also mislead your audience; therefore, it is important to understand when to use each type of visual depending on your analytical goals. Our complete guide on the types of graphs and charts lists 20 different visuals with examples of when to use them. 
  • Flawed correlation: Misleading statistics can significantly damage your research. We’ve already pointed out a few interpretation issues previously in the post, but it is an important barrier that we can't avoid addressing here as well. Flawed correlations occur when two variables appear related to each other but are not. Confusing correlation with causation can lead to a wrong interpretation of results, which in turn can lead to flawed strategies and wasted resources; therefore, it is very important to identify the different interpretation mistakes and avoid them. 
  • Sample size: A very common barrier to a reliable and efficient analysis process is the sample size. In order for the results to be trustworthy, the sample should be representative of what you are analyzing. For example, imagine you have a company of 1000 employees and you ask the question “do you like working here?” to 50 employees, of which 49 say yes, which means 98%. Now, imagine you ask the same question to all 1000 employees and 980 say yes, which also means 98%. Saying that 98% of employees like working in the company when the sample size was only 50 is not a representative or trustworthy conclusion. The results are far more trustworthy when surveying a bigger sample.
  • Privacy concerns: In some cases, data collection can be subjected to privacy regulations. Businesses gather all kinds of information from their customers from purchasing behaviors to addresses and phone numbers. If this falls into the wrong hands due to a breach, it can affect the security and confidentiality of your clients. To avoid this issue, you need to collect only the data that is needed for your research and, if you are using sensitive facts, make it anonymous so customers are protected. The misuse of customer data can severely damage a business's reputation, so it is important to keep an eye on privacy. 
  • Lack of communication between teams: When it comes to performing data analysis on a business level, it is very likely that each department and team will have different goals and strategies. However, they are all working for the same common goal of helping the business run smoothly and keep growing. When teams are not connected and communicating with each other, it can directly affect the way general strategies are built. To avoid these issues, tools such as data dashboards enable teams to stay connected through data in a visually appealing way. 
  • Innumeracy: Businesses are working with data more and more every day. While there are many BI tools available to perform effective analysis, data literacy is still a constant barrier. Not all employees know how to apply analysis techniques or extract insights from them. To prevent this from happening, you can implement different training opportunities that will prepare every relevant user to deal with data. 
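
The sample size barrier can be made concrete with a confidence interval. The sketch below uses the Wilson score interval, one standard choice for proportions, to compare 49 "yes" answers out of 50 (49/50 is 98%) with an assumed 980 out of 1000. It treats both groups as random samples from a much larger population, which is a simplification: if the 1000 respondents are the entire workforce, there is no sampling error left to estimate.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z = 1.96)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

small = wilson_interval(49, 50)      # 50 respondents
large = wilson_interval(980, 1000)   # 1000 respondents
small_width = small[1] - small[0]
large_width = large[1] - large[0]
```

Both samples report the same share of "yes" answers, but the interval for the sample of 50 is about ten percentage points wide, against under two for the sample of 1000, so the smaller survey pins down the true value far less precisely.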

Key Data Analysis Skills

As you've learned throughout this lengthy guide, analyzing data is a complex task that requires a lot of knowledge and skills. That said, thanks to the rise of self-service tools, the process is far more accessible and agile than it once was. Regardless, some key skills remain valuable when working with data. We list the most important ones below.

  • Critical and statistical thinking: To successfully analyze data you need to be creative and think outside the box. That might sound like a strange statement considering that data is often tied to facts. However, a great level of critical thinking is required to uncover connections, come up with a valuable hypothesis, and draw conclusions that go beyond the surface. This, of course, needs to be complemented by statistical thinking and an understanding of numbers.
  • Data cleaning: Anyone who has worked with data will tell you that cleaning and preparation take up a large share of an analyst's time (a figure of 80% is often quoted), so the skill is fundamental. Beyond that, failing to clean the data adequately can significantly damage the analysis, which can lead to poor decision-making in a business scenario. While there are multiple tools that automate the cleaning process and reduce the risk of human error, it is still a valuable skill to master.
  • Data visualization: Visuals make the information easier to understand and analyze, not only for professional users but especially for non-technical ones. Having the necessary skills to not only choose the right chart type but know when to apply it correctly is key. This also means being able to design visually compelling charts that make the data exploration process more efficient. 
  • SQL: The Structured Query Language, or SQL, is a language used to communicate with databases. It is fundamental knowledge as it enables you to query, update, and organize data in relational databases, which are the most common databases used by companies. It is fairly easy to learn and one of the most valuable skills when it comes to data analysis.
  • Communication skills: This is a skill that is especially valuable in a business environment. Being able to clearly communicate analytical outcomes to colleagues is incredibly important, especially when the information you are trying to convey is complex for non-technical people. This applies to in-person communication as well as written format, for example, when generating a dashboard or report. While this might be considered a “soft” skill compared to the other ones we mentioned, it should not be ignored as you most likely will need to share analytical findings with others no matter the context. 
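
To give a feel for the SQL skill listed above, the following sketch runs a few SQL statements from Python using the standard-library sqlite3 module and an in-memory database. The `orders` table, its columns, and the rows are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database with a small, made-up table of orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Ana", "North", 120.0),
        ("Ben", "North", 80.0),
        ("Cy", "South", 200.0),
        ("Dee", "South", 50.0),
        ("Eli", "South", 50.0),
    ],
)

# Update: correct one record.
conn.execute("UPDATE orders SET amount = 90.0 WHERE customer = 'Ben'")

# Organize: count orders and total revenue per region, largest first.
rows = conn.execute(
    "SELECT region, COUNT(*) AS n_orders, SUM(amount) AS revenue "
    "FROM orders GROUP BY region ORDER BY revenue DESC"
).fetchall()

for region, n_orders, revenue in rows:
    print(region, n_orders, revenue)

conn.close()
```

The same `SELECT ... GROUP BY` pattern works in most relational databases, although details such as data types and connection setup differ between systems.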

Data Analysis In The Big Data Environment

Big data is invaluable to today’s businesses, and by using different methods for data analysis, it’s possible to view your data in a way that can help you turn insight into positive action.

To inspire your efforts and put the importance of big data into context, here are some insights that you should know:

  • By 2026 the industry of big data is expected to be worth approximately $273.4 billion.
  • 94% of enterprises say that analyzing data is important for their growth and digital transformation. 
  • Companies that exploit the full potential of their data can increase their operating margins by 60%.
  • We already told you the benefits of Artificial Intelligence through this article. This industry's financial impact is expected to grow up to $40 billion by 2025.

Data analysis concepts may come in many forms, but fundamentally, any solid methodology will help to make your business more streamlined, cohesive, insightful, and successful than ever before.

Key Takeaways From Data Analysis 

As we reach the end of our data analysis journey, we leave a small summary of the main methods and techniques to perform excellent analysis and grow your business.

17 Essential Types of Data Analysis Methods:

  • Cluster analysis
  • Cohort analysis
  • Regression analysis
  • Factor analysis
  • Neural Networks
  • Data Mining
  • Text analysis
  • Time series analysis
  • Decision trees
  • Conjoint analysis 
  • Correspondence Analysis
  • Multidimensional Scaling 
  • Content analysis 
  • Thematic analysis
  • Narrative analysis 
  • Grounded theory analysis
  • Discourse analysis 

Top 17 Data Analysis Techniques:

  • Collaborate your needs
  • Establish your questions
  • Data democratization
  • Think of data governance 
  • Clean your data
  • Set your KPIs
  • Omit useless data
  • Build a data management roadmap
  • Integrate technology
  • Answer your questions
  • Visualize your data
  • Interpretation of data
  • Consider autonomous technology
  • Build a narrative
  • Share the load
  • Data Analysis tools
  • Refine your process constantly 

We’ve pondered the data analysis definition and drilled down into the practical applications of data-centric analytics, and one thing is clear: by taking measures to arrange your data and making your metrics work for you, it’s possible to transform raw information into action, the kind that will push your business to the next level.

Yes, good data analytics techniques result in enhanced business intelligence (BI). To help you understand this notion in more detail, read our exploration of business intelligence reporting.

And, if you’re ready to perform your own analysis, drill down into your facts and figures while interacting with your data on astonishing visuals, you can try our software for a free, 14-day trial.


Regression Analysis

Regression analysis is a quantitative research method used when a study involves modelling and analysing several variables. In simple terms, it tests the nature of the relationship between a dependent variable and one or more independent variables.

The basic form of regression models includes unknown parameters (β), independent variables (X), and the dependent variable (Y).

A regression model specifies the relation of the dependent variable (Y) to a function of the independent variables (X) and unknown parameters (β):

                                    Y  ≈  f (X, β)   

The regression equation can be used to predict the value of ‘y’ when the value of ‘x’ is given, where ‘y’ and ‘x’ are two sets of measures from a sample of size ‘n’. The formulae for the regression equation would be

[Figure: regression equation formulae]

Do not be intimidated by the visual complexity of the correlation and regression formulae above. You don’t have to apply the formulae manually: correlation and regression analyses can be run with popular analytical software such as Microsoft Excel, Microsoft Access, SPSS and others.
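
As a rough illustration of what such software computes, the sketch below fits a simple linear regression by ordinary least squares in plain Python. It assumes the simplest case, one independent variable and the form y = a + bx, and uses the standard least-squares expressions for the slope b and intercept a. The function name and the sample data (hours studied and test scores) are hypothetical.

```python
def fit_simple_regression(x, y):
    """Return (a, b) for the least-squares line y = a + b * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        raise ValueError("x values must not all be identical")
    b = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    a = (sum_y - b * sum_x) / n                     # intercept
    return a, b


# Hypothetical data: hours studied (x) and test score (y).
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 64, 70, 71]
a, b = fit_simple_regression(hours, scores)
print(f"predicted score = {a:.2f} + {b:.2f} * hours")
print(f"prediction for 7 hours: {a + b * 7:.1f}")
```

Once a and b are estimated, predicting ‘y’ for a new ‘x’ is a matter of plugging the value into the fitted line.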

Linear regression analysis is based on the following set of assumptions:

1. Assumption of linearity. There is a linear relationship between the dependent and independent variables.

2. Assumption of homoscedasticity. The residuals (the differences between observed and predicted values of the dependent variable) have constant variance across all values of the independent variables.

3. Assumption of absence of collinearity or multicollinearity. There is no strong correlation between two or more independent variables.

4. Assumption of normal distribution. The residuals of the model are normally distributed.

John Dudovskiy


New Content From Advances in Methods and Practices in Psychological Science

A Practical Guide to Conversation Research: How to Study What People Say to Each Other Michael Yeomans, F. Katelynn Boland, Hanne Collins, Nicole Abi-Esber, and Alison Wood Brooks  

Conversation—a verbal interaction between two or more people—is a complex, pervasive, and consequential human behavior. Conversations have been studied across many academic disciplines. However, advances in recording and analysis techniques over the last decade have allowed researchers to more directly and precisely examine conversations in natural contexts and at a larger scale than ever before, and these advances open new paths to understand humanity and the social world. Existing reviews of text analysis and conversation research have focused on text generated by a single author (e.g., product reviews, news articles, and public speeches) and thus leave open questions about the unique challenges presented by interactive conversation data (i.e., dialogue). In this article, we suggest approaches to overcome common challenges in the workflow of conversation science, including recording and transcribing conversations, structuring data (to merge turn-level and speaker-level data sets), extracting and aggregating linguistic features, estimating effects, and sharing data. This practical guide is meant to shed light on current best practices and empower more researchers to study conversations more directly—to expand the community of conversation scholars and contribute to a greater cumulative scientific understanding of the social world. 

Open-Science Guidance for Qualitative Research: An Empirically Validated Approach for De-Identifying Sensitive Narrative Data Rebecca Campbell, McKenzie Javorka, Jasmine Engleton, Kathryn Fishwick, Katie Gregory, and Rachael Goodman-Williams  

The open-science movement seeks to make research more transparent and accessible. To that end, researchers are increasingly expected to share de-identified data with other scholars for review, reanalysis, and reuse. In psychology, open-science practices have been explored primarily within the context of quantitative data, but demands to share qualitative data are becoming more prevalent. Narrative data are far more challenging to de-identify fully, and because qualitative methods are often used in studies with marginalized, minoritized, and/or traumatized populations, data sharing may pose substantial risks for participants if their information can be later reidentified. To date, there has been little guidance in the literature on how to de-identify qualitative data. To address this gap, we developed a methodological framework for remediating sensitive narrative data. This multiphase process is modeled on common qualitative-coding strategies. The first phase includes consultations with diverse stakeholders and sources to understand reidentifiability risks and data-sharing concerns. The second phase outlines an iterative process for recognizing potentially identifiable information and constructing individualized remediation strategies through group review and consensus. The third phase includes multiple strategies for assessing the validity of the de-identification analyses (i.e., whether the remediated transcripts adequately protect participants’ privacy). We applied this framework to a set of 32 qualitative interviews with sexual-assault survivors. We provide case examples of how blurring and redaction techniques can be used to protect names, dates, locations, trauma histories, help-seeking experiences, and other information about dyadic interactions. 

Impossible Hypotheses and Effect-Size Limits Wijnand van Tilburg and Lennert van Tilburg

Psychological science is moving toward further specification of effect sizes when formulating hypotheses, performing power analyses, and considering the relevance of findings. This development has sparked an appreciation for the wider context in which such effect sizes are found because the importance assigned to specific sizes may vary from situation to situation. We add to this development a crucial but in psychology hitherto underappreciated contingency: There are mathematical limits to the magnitudes that population effect sizes can take within the common multivariate context in which psychology is situated, and these limits can be far more restrictive than typically assumed. The implication is that some hypothesized or preregistered effect sizes may be impossible. At the same time, these restrictions offer a way of statistically triangulating the plausible range of unknown effect sizes. We explain the reason for the existence of these limits, illustrate how to identify them, and offer recommendations and tools for improving hypothesized effect sizes by exploiting the broader multivariate context in which they occur. 

It’s All About Timing: Exploring Different Temporal Resolutions for Analyzing Digital-Phenotyping Data Anna Langener, Gert Stulp, Nicholas Jacobson, Andrea Costanzo, Raj Jagesar, Martien Kas, and Laura Bringmann  

The use of smartphones and wearable sensors to passively collect data on behavior has great potential for better understanding psychological well-being and mental disorders with minimal burden. However, there are important methodological challenges that may hinder the widespread adoption of these passive measures. A crucial one is the issue of timescale: The chosen temporal resolution for summarizing and analyzing the data may affect how results are interpreted. Despite its importance, the choice of temporal resolution is rarely justified. In this study, we aim to improve current standards for analyzing digital-phenotyping data by addressing the time-related decisions faced by researchers. For illustrative purposes, we use data from 10 students whose behavior (e.g., GPS, app usage) was recorded for 28 days through the Behapp application on their mobile phones. In parallel, the participants actively answered questionnaires on their phones about their mood several times a day. We provide a walk-through on how to study different timescales by doing individualized correlation analyses and random-forest prediction models. By doing so, we demonstrate how choosing different resolutions can lead to different conclusions. Therefore, we propose conducting a multiverse analysis to investigate the consequences of choosing different temporal resolutions. This will improve current standards for analyzing digital-phenotyping data and may help combat the replications crisis caused in part by researchers making implicit decisions. 

Calculating Repeated-Measures Meta-Analytic Effects for Continuous Outcomes: A Tutorial on Pretest–Posttest-Controlled Designs David R. Skvarc, Matthew Fuller-Tyszkiewicz  

Meta-analysis is a statistical technique that combines the results of multiple studies to arrive at a more robust and reliable estimate of an overall effect or estimate of the true effect. Within the context of experimental study designs, standard meta-analyses generally use between-groups differences at a single time point. This approach fails to adequately account for preexisting differences that are likely to threaten causal inference. Meta-analyses that take into account the repeated-measures nature of these data are uncommon, and so this article serves as an instructive methodology for increasing the precision of meta-analyses by attempting to estimate the repeated-measures effect sizes, with particular focus on contexts with two time points and two groups (a between-groups pretest–posttest design)—a common scenario for clinical trials and experiments. In this article, we summarize the concept of a between-groups pretest–posttest meta-analysis and its applications. We then explain the basic steps involved in conducting this meta-analysis, including the extraction of data and several alternative approaches for the calculation of effect sizes. We also highlight the importance of considering the presence of within-subjects correlations when conducting this form of meta-analysis.   

Reliability and Feasibility of Linear Mixed Models in Fully Crossed Experimental Designs Michele Scandola, Emmanuele Tidoni  

The use of linear mixed models (LMMs) is increasing in psychology and neuroscience research. In this article, we focus on the implementation of LMMs in fully crossed experimental designs. A key aspect of LMMs is choosing a random-effects structure according to the experimental needs. To date, opposite suggestions are present in the literature, spanning from keeping all random effects (maximal models), which produces several singularity and convergence issues, to removing random effects until the best fit is found, with the risk of inflating Type I error (reduced models). However, defining the random structure to fit a nonsingular and convergent model is not straightforward. Moreover, the lack of a standard approach may lead the researcher to make decisions that potentially inflate Type I errors. After reviewing LMMs, we introduce a step-by-step approach to avoid convergence and singularity issues and control for Type I error inflation during model reduction of fully crossed experimental designs. Specifically, we propose the use of complex random intercepts (CRIs) when maximal models are overparametrized. CRIs are multiple random intercepts that represent the residual variance of categorical fixed effects within a given grouping factor. We validated CRIs and the proposed procedure by extensive simulations and a real-case application. We demonstrate that CRIs can produce reliable results and require less computational resources. Moreover, we outline a few criteria and recommendations on how and when scholars should reduce overparametrized models. Overall, the proposed procedure provides clear solutions to avoid overinflated results using LMMs in psychology and neuroscience.

Understanding Meta-Analysis Through Data Simulation With Applications to Power Analysis Filippo Gambarota, Gianmarco Altoè  

Meta-analysis is a powerful tool to combine evidence from existing literature. Despite several introductory and advanced materials about organizing, conducting, and reporting a meta-analysis, to our knowledge, there are no introductive materials about simulating the most common meta-analysis models. Data simulation is essential for developing and validating new statistical models and procedures. Furthermore, data simulation is a powerful educational tool for understanding a statistical method. In this tutorial, we show how to simulate equal-effects, random-effects, and metaregression models and illustrate how to estimate statistical power. Simulations for multilevel and multivariate models are available in the Supplemental Material available online. All materials associated with this article can be accessed on OSF ( https://osf.io/54djn/ ).   

U.S. Food and Drug Administration

Principal stratification methods and software for intercurrent events in clinical trials

CERSI Collaborators: Triangle CERSI, Duke University: Fan Li, PhD; Laine Thomas, PhD; Anqi Zhao, PhD; Susan Halabi, PhD

FDA Collaborators: Yuan-Li Shen, Dr.P.H; Pallavi Mishra-Kalyani, PhD; Shu Wang, PhD.; Xiaoxue Li, PhD.; Joyce Cheng, PhD

CERSI Subcontractors: Flying Buttress Associates- Jeph Herrin, PhD

CERSI In-Kind Collaborators: OptumLabs - William Crown, PhD; University of San Francisco - Sanket Dhruva, MD

Non-Federal Entity Collaborators: Johnson and Johnson- Karla Childers, MSJ, Paul Coplan, ScD, MBA, Stephen Johnston, MSc

Project Start Date: September 8, 2023

Regulatory Science Challenge

Events that occur post randomization in randomized controlled trials, known as intercurrent events, can alter the course of a trial and jeopardize comparative effectiveness evaluation and, consequently, decision making in regulatory science. The standard approach of intention-to-treat analysis ignores intercurrent events and thus preserves the trial validity based on randomization, but it fails to capture treatment effect heterogeneity and the complex causal mechanism. The 2018 ICH E9(R1) addendum suggests principal stratification as an alternative approach to handle intercurrent events, but significant gaps exist between the theory and practice of principal stratification in regulatory science. In particular, there is a lack of transparent and accessible analytical methods, practical guidelines, and software for principal stratification in the context of regulatory science.

Project Description and Goals

This project aims to develop a suite of transparent and accessible analysis tools, software and educational material for applying the principal stratification method to analyze intercurrent events in clinical trials. Investigators will focus on two prevalent types of intercurrent events: (1) nonadherence to assigned treatment, including treatment switching and discontinuation and (2) truncation of the target outcome by a terminal event. For each type, investigators will develop estimand, computational, visualization, and sensitivity analysis tools, with a special emphasis on time-to-event outcomes. They will also develop a companion R package and tutorials with illustrations of clinical trials in oncology and other diseases. The results of this study will impact clinical trials in two ways: (1) produce new methodological tools for addressing a pressing and prevalent complication in clinical trials, (2) provide associated open-source software and educational material to disseminate the methodology to regulatory agencies, health researchers, and industry. Investigators also plan to develop scientific publications describing the outcomes of this research and discuss it at public forums.

Research Outcomes/Results

Two hundred and twenty-three patients with a mean age of 65 years completed the survey. These patients preferred a higher chance of good biopsy outcomes, and a lower chance of erectile dysfunction caused by the treatment and urinary incontinence after treatment. The patients stated in the survey that they are willing to accept:

  • a 15.1%-point increase in erectile dysfunction caused by the treatment to achieve a 10%-point increase in a good biopsy outcome after HIFU ablation, and
  • an 8.5%-point increase in urinary incontinence for a 10%-point increase in a good biopsy.

Also, further analysis revealed that patients who thought their cancer was more aggressive were more willing to tolerate urinary incontinence. Younger men were willing to tolerate less erectile dysfunction risk than older men. Respondents with a greater than college level of education were less willing to tolerate erectile dysfunction or urinary incontinence.

Research Impacts

Incorporating patient preference information into decisions that FDA makes about regulating devices is one of the major goals of FDA’s Center for Devices and Radiological Health (CDRH). Study findings show that patients prefer specific outcomes related to prostate ablation therapies like HIFU. The study results may help inform the design and regulation of current and future prostate tissue ablation devices by providing information about outcomes that patients most desire.


  • PMID: 34677594; Citation: Wallach JD, Deng Y, McCoy RG, Dhruva SS, Herrin J, Berkowitz A, Polley EC, Quinto K, Gandotra C, Crown W, Noseworthy P, Yao X, Shah ND, Ross JS, Lyon TD. Real-world Cardiovascular Outcomes Associated With Degarelix vs Leuprolide for Prostate Cancer Treatment.  JAMA Netw Open. 2021;4(10):e2130587. doi:10.1001/jamanetworkopen.2021.30587 .
  • PMID: 36191949; Citation: Deng Y, Polley EC, Wallach JD, Dhruva SS, Herrin J, Quinto K, Gandotra C, Crown W, Noseworthy P, Yao X, Lyon TD, Shah ND, Ross JS, McCoy RG. Emulating the GRADE trial using real world data: retrospective comparative effectiveness study. BMJ . 2022 Oct 3;379:e070717. doi: 10.1136/bmj-2022-070717 .

Teens and Video Games Today

  • Methodology

The analysis in this report is based on a self-administered web survey conducted from Sept. 26 to Oct. 23, 2023, among a sample of 1,453 dyads, with each dyad (or pair) comprised of one U.S. teen ages 13 to 17 and one parent per teen. The margin of sampling error for the full sample of 1,453 teens is plus or minus 3.2 percentage points. The margin of sampling error for the full sample of 1,453 parents is plus or minus 3.2 percentage points. The survey was conducted by Ipsos Public Affairs in English and Spanish using KnowledgePanel, its nationally representative online research panel.

The research plan for this project was submitted to an external institutional review board (IRB), Advarra, which is an independent committee of experts that specializes in helping to protect the rights of research participants. The IRB thoroughly vetted this research before data collection began. Due to the risks associated with surveying minors, this research underwent a full board review and received approval (Approval ID Pro00073203).

KnowledgePanel members are recruited through probability sampling methods and include both those with internet access and those who did not have internet access at the time of their recruitment. KnowledgePanel provides internet access for those who do not have it and, if needed, a device to access the internet when they join the panel. KnowledgePanel’s recruitment process was originally based exclusively on a national random-digit dialing (RDD) sampling methodology. In 2009, Ipsos migrated to an address-based sampling (ABS) recruitment methodology via the U.S. Postal Service’s Delivery Sequence File (DSF). The Delivery Sequence File has been estimated to cover as much as 98% of the population, although some studies suggest that the coverage could be in the low 90% range. 4

Panelists were eligible for participation in this survey if they indicated on an earlier profile survey that they were the parent of a teen ages 13 to 17. A random sample of 3,981 eligible panel members were invited to participate in the study. Responding parents were screened and considered qualified for the study if they reconfirmed that they were the parent of at least one child ages 13 to 17 and granted permission for their teen who was chosen to participate in the study. In households with more than one eligible teen, parents were asked to think about one randomly selected teen, and that teen was instructed to complete the teen portion of the survey. A survey was considered complete if both the parent and selected teen completed their portions of the questionnaire, or if the parent did not qualify during the initial screening.

Of the sampled panelists, 1,763 (excluding break-offs) responded to the invitation and 1,453 qualified, completed the parent portion of the survey, and had their selected teen complete the teen portion of the survey, yielding a final stage completion rate of 44% and a qualification rate of 82%. The cumulative response rate accounting for nonresponse to the recruitment surveys and attrition is 2.2%. The break-off rate among those who logged on to the survey (regardless of whether they completed any items or qualified for the study) is 26.9%.
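
The completion and qualification rates reported above can be reproduced from the counts given in the text. The short sketch below does the arithmetic; the pairing of each rate with its numerator and denominator is inferred from the reported figures rather than quoted from the survey documentation, and the variable names are illustrative. The cumulative response rate is not recomputed because it depends on recruitment-stage rates that are not listed here.

```python
invited = 3981              # eligible panel members invited
responded = 1763            # responded to the invitation (excluding break-offs)
qualified_completes = 1453  # qualified, with parent and teen portions completed

# Final stage completion rate: share of invited panelists who responded.
completion_rate = responded / invited

# Qualification rate: share of responders who qualified and completed.
qualification_rate = qualified_completes / responded

print(f"Final stage completion rate: {completion_rate:.1%}")
print(f"Qualification rate: {qualification_rate:.1%}")
```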

Qualified respondents received a cash-equivalent incentive worth $10 for completing the survey. To encourage response from non-Hispanic Black panelists, the incentive was increased from $10 to $20 on Oct. 5, 2023. The incentive was increased again on Oct. 10, from $20 to $40; then to $50 on Oct. 17; and to $75 on Oct. 20. Reminders and notifications of the change in incentive were sent for each increase.

All panelists received email invitations and any nonresponders received reminders, shown in the table. The field period was closed on Oct. 23, 2023.

A table showing Invitation and reminder dates

The analysis in this report was performed using separate weights for parents and teens. The parent weight was created in a multistep process that begins with a base design weight for the parent, which is computed to reflect their probability of selection for recruitment into the KnowledgePanel. These selection probabilities were then adjusted to account for the probability of selection for this survey, which included oversamples of non-Hispanic Black and Hispanic parents. Next, an iterative technique was used to align the parent design weights to population benchmarks for parents of teens ages 13 to 17 on the dimensions identified in the accompanying table, to account for any differential nonresponse that may have occurred.

To create the teen weight, an adjustment factor was applied to the final parent weight to reflect the selection of one teen per household. Finally, the teen weights were further raked to match the demographic distribution for teens ages 13 to 17 who live with parents. The teen weights were adjusted on the same dimensions as the parent weights, with the exception of teen education, which was not used in the teen weighting.
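
For readers unfamiliar with raking, the sketch below shows the general idea of the iterative technique: weights are adjusted one dimension at a time until the weighted sample matches every population benchmark. This is a minimal illustration in plain Python, not the procedure Ipsos used. The respondents, dimensions, benchmark shares, and function names are all hypothetical.

```python
def rake(rows, weights, targets, max_iter=100, tol=1e-8):
    """Adjust weights so weighted shares match the target shares.

    rows: list of dicts, one per respondent.
    targets: {dimension: {category: population share}}; shares sum to 1.
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for dim, shares in targets.items():
            total = sum(w)
            # Current weighted total in each category of this dimension.
            current = {}
            for row, wi in zip(rows, w):
                current[row[dim]] = current.get(row[dim], 0.0) + wi
            # Scale each weight so the category hits its benchmark share.
            for i, row in enumerate(rows):
                factor = shares[row[dim]] * total / current[row[dim]]
                max_change = max(max_change, abs(factor - 1.0))
                w[i] *= factor
        if max_change < tol:  # all dimensions already match
            break
    return w


def weighted_share(rows, weights, dim, category):
    """Weighted share of respondents falling in one category."""
    hit = sum(wi for row, wi in zip(rows, weights) if row[dim] == category)
    return hit / sum(weights)


# Hypothetical respondents and benchmarks.
respondents = [
    {"gender": "F", "age": "13-14"},
    {"gender": "F", "age": "15-17"},
    {"gender": "F", "age": "15-17"},
    {"gender": "M", "age": "13-14"},
    {"gender": "M", "age": "13-14"},
    {"gender": "M", "age": "15-17"},
]
benchmarks = {
    "gender": {"F": 0.5, "M": 0.5},
    "age": {"13-14": 0.4, "15-17": 0.6},
}
base_weights = [1.0] * len(respondents)
raked = rake(respondents, base_weights, benchmarks)
print([round(wi, 3) for wi in raked])
```

In practice the starting weights would be the design weights described above rather than all ones, and the benchmarks would come from population estimates.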

Sampling errors and tests of statistical significance take into account the effect of weighting. Interviews were conducted in both English and Spanish.
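
One common way to see how weighting affects sampling error is Kish's approximate design effect, which inflates the simple-random-sample margin of error according to how unequal the weights are. The sketch below is an illustration under that assumption; the exact variance method used for this survey is not described here, and the function names and example weights are hypothetical.

```python
import math


def kish_design_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def margin_of_error(n, design_effect=1.0, p=0.5, z=1.96):
    """Margin of error for a proportion at the 95% confidence level."""
    return z * math.sqrt(design_effect * p * (1 - p) / n)


# Margin of error for n = 1,453 if the sample were unweighted.
srs_moe = margin_of_error(1453)

# Design effect implied by the reported +/- 3.2 percentage points.
implied_deff = (0.032 / srs_moe) ** 2

print(f"Unweighted margin of error: {100 * srs_moe:.1f} points")
print(f"Implied design effect: {implied_deff:.2f}")

# Equal weights give a design effect of 1; unequal weights raise it.
print(kish_design_effect([1.0, 1.0, 1.0, 1.0]))
print(kish_design_effect([1.0, 3.0]))
```

The gap between the unweighted figure and the reported margin of sampling error reflects the loss of precision that comes with weighting.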

In addition to sampling error, one should bear in mind that question wording and practical difficulties in conducting surveys can introduce error or bias into the findings of opinion polls.

The following table shows the unweighted sample sizes and the error attributable to sampling that would be expected at the 95% level of confidence for different groups in the survey:

A table showing the unweighted sample sizes and the error attributable to sampling

Sample sizes and sampling errors for subgroups are available upon request.

Dispositions and response rates

The tables below display dispositions used in the calculation of completion, qualification and cumulative response rates. 5

A table showing Dispositions and response rates

© Pew Research Center, 2023

  4. AAPOR Task Force on Address-based Sampling. 2016. “AAPOR Report: Address-based Sampling.”
  5. For more information on this method of calculating response rates, refer to: Callegaro, Mario, and Charles DiSogra. 2008. “Computing response metrics for online panels.” Public Opinion Quarterly.

Multi-omics analysis reveals key regulatory defense pathways and genes involved in salt tolerance of rose plants

These authors contributed equally to this work.

Haoran Ren, Wenjing Yang, Weikun Jing, Muhammad Owais Shahid, Yuming Liu, Xianhan Qiu, Patrick Choisy, Tao Xu, Nan Ma, Junping Gao, Xiaofeng Zhou, Multi-omics analysis reveals key regulatory defense pathways and genes involved in salt tolerance of rose plants, Horticulture Research , Volume 11, Issue 5, May 2024, uhae068, https://doi.org/10.1093/hr/uhae068

Salinity stress causes serious damage to crops worldwide, limiting plant production. However, the metabolic and molecular mechanisms underlying the response to salt stress in rose ( Rosa spp.) remain poorly studied. We therefore performed a multi-omics investigation of Rosa hybrida cv. Jardin de Granville (JDG) and Rosa damascena Mill. (DMS) under salt stress to determine the mechanisms underlying rose adaptability to salinity stress. Salt treatment of both JDG and DMS led to the buildup of reactive oxygen species (H₂O₂). Palisade tissue was more severely damaged in DMS than in JDG, while the relative electrolyte permeability was lower and the soluble protein content was higher in JDG than in DMS. Metabolome profiling revealed significant alterations in phenolic acid, lipid, and flavonoid metabolite levels in JDG and DMS under salt stress. Proteome analysis identified enrichment of flavone and flavonol pathways in JDG under salt stress. RNA sequencing showed that salt stress influenced primary metabolism in DMS, whereas it substantially affected secondary metabolism in JDG. Integrating these datasets revealed that the phenylpropane pathway, especially the flavonoid pathway, is strongly enhanced in rose under salt stress. Consistent with this, weighted gene coexpression network analysis (WGCNA) identified the key regulatory gene chalcone synthase 1 ( CHS1 ), which is important in the phenylpropane pathway. Moreover, luciferase assays indicated that the bHLH74 transcription factor binds to the CHS1 promoter to block its transcription. These results clarify the role of the phenylpropane pathway, especially flavonoid and flavonol metabolism, in the response to salt stress in rose.

Rose ( Rosa spp.) is a popular ornamental crop that is also used in the cosmetics, perfume, and medicine industries. Rose plants contain various bioactive substances, including flavonoids, fragrant components, and hydrolysable and condensed tannins, which have high value and market potential [ 1 ]. However, soil salinization is common in many rose-growing regions, and high salt concentrations in soil can severely inhibit rose plant growth, reduce flower quality, and cause significant economic losses [ 2 ]. Additionally, salt stress can increase the levels of secondary metabolites in roses, such as citronellol, geraniol, and phenyl ethyl alcohol [ 3 , 4 ]. Such alterations in secondary metabolites may help to regulate the salt tolerance of rose. Research on roses has focused mainly on flower quality, petal development, and flower bloom [ 5–7 ], and there are limited data available regarding signaling pathways linking plant development and secondary metabolites associated with salt stress.

In plants, salt stress induces osmotic imbalances, which lead to the closure of leaf stomata, limit photosynthesis, and affect plant growth and metabolism [ 8 ]. To alleviate osmotic stress and protect themselves from its adverse effects, plants accumulate numerous compatible solutes (such as soluble proteins, soluble sugars, and proline), known collectively as osmoprotectants [ 9 ]. Moreover, plants generate reactive oxygen species (ROS) to cope with salt stress [ 10 ]. Nevertheless, excessive ROS accumulation can lead to oxidative DNA damage, affect protein biosynthesis, and ultimately result in cell damage and death [ 11 , 12 ]. Plant cells utilize both enzymatic and nonenzymatic antioxidant mechanisms to diminish ROS levels and prevent oxidative damage. Superoxide dismutase (SOD), peroxidase (POD), ascorbate peroxidase (APX), catalase (CAT), and glutathione peroxidase (GPX) are antioxidant enzymes that work as O₂⁻ and H₂O₂ scavengers [ 13 , 14 ]. Nonenzymatic antioxidants, such as ascorbate, glutathione, phenols, and flavonoids, also play vital roles in ROS scavenging [ 15 , 16 ].

Flavonoids are naturally occurring bioactive substances found in fruits, vegetables, tea, and medicinal plants [ 17 ]. Flavonoids comprise more than 9000 compounds and constitute a substantial category of plant secondary metabolites [ 18 ]. They have diverse biological functions in the growth and development of plants, including improving pollen fertility, imparting color, and influencing seed dormancy and germination [ 19 , 20 ]. In addition, flavonoids have protective roles against biotic and abiotic stresses, such as pathogen infections, ultraviolet (UV)-B, cold, drought, and salinity [ 21–23 ]. Flavonoids have also received widespread attention due to their possible benefits for human health [ 24 ].

The molecular mechanism of flavonoid biosynthesis has been elucidated in many plants [ 25 ]. Chalcone synthase (CHS) mediates the first step in flavonoid production, catalyzing the formation of naringenin chalcone from three molecules of malonyl CoA and one molecule of 4-coumaroyl CoA. Chalcone isomerase (CHI) then quickly converts naringenin chalcone into naringenin (flavanone), which is further biosynthesized into different flavonoids by the subsequent enzymes in this pathway [ 26 ]. Although the biosynthesis of flavonoids has attracted increasing attention from scholars, current research does not fully explain the effects of regulatory factors on the transcription and activity of the major enzymes in flavonoid metabolism. Therefore, further research on the signaling molecules and regulatory pathways associated with flavonoids, as well as their regulatory mechanisms, is needed to elucidate the physiological activity of flavonoids.

Rosa hybrida cv. Jardin de Granville (JDG) is a new hybrid rose developed by 'Les Roses Anciennes André Eve' for the Prestige range of Christian Dior skin care products. JDG possesses twice the vitality of a traditional rose and grows and blooms vigorously in the salty air and harsh winds of coastal climates. JDG is also rich in beneficial bioactive substances that are mainly used in cosmetics and anti-aging skin care creams [ 27 , 28 ]. Rosa damascena Mill. (DMS) is one of the most common fragrant roses in the Rosaceae family. Its essential oils and aromatic compounds are used extensively in the cosmetic and food industries worldwide [ 29 ]. DMS is considered an excellent rose throughout the world due to its high resistance to abiotic stress and abundance of beneficial secondary metabolites [ 30 ].

Here, we conducted an integrated analysis on the transcriptomes, proteomes, and metabolomes of JDG and DMS to explore the relationship between plant development and secondary metabolites of rose under salt stress. We used WGCNA and Cytoscape software to decipher the similarities and differences in the complex metabolic pathways and regulatory genes of JDG and DMS under salt stress. These results provide comprehensive information on the metabolic and molecular mechanisms of the response to salt stress in rose, promoting the cultivation of excellent new rose varieties that are both salt tolerant and rich in beneficial secondary metabolites.

JDG is more tolerant than DMS to salt stress

To explore the salt tolerance of rose, plants of JDG and DMS were treated with 400 mM NaCl for 2 weeks. DMS plants showed typical damage with yellowing and death of leaves, while JDG leaves only exhibited slight wilting ( Fig. 1A ). Additionally, detached rose leaves were treated with salt for 4 days; DMS leaves showed significantly more necrosis than JDG leaves ( Fig. 1B ). To observe the response of the rose cultivars to salt stress quickly and to simplify sampling, subsequent experiments mainly used detached rose leaves. To examine the overall anatomy and morphology of leaves treated for 2 days with NaCl, we stained treated and control leaves with toluidine blue and prepared thin sections. Palisade tissue damage in response to salt treatment was more severe in DMS than in JDG (indicated by red arrowheads in Fig. 1C ). To investigate ROS accumulation in response to salt stress, we performed 3,3'-diaminobenzidine (DAB) staining. DMS leaves accumulated substantially more ROS (deeper staining) than JDG leaves after salt stress, whereas there was no difference in ROS content between these two cultivars under normal conditions ( Fig. 1D, E ). Soluble protein content was higher in JDG leaves after 4 days of salt stress than after 2 days of salt stress, while the soluble protein content of DMS leaves was much higher after 2 days of salt treatment than before treatment and had decreased by 4 days ( Fig. 1F ). The relative electrolyte permeability of JDG leaves increased slightly after 2 days of salt treatment and more substantially after 4 days, while relative electrolyte permeability was much higher in DMS than in JDG on both days ( Fig. 1G ). Together, these phenotypic and physiological analyses indicated that JDG is more salt tolerant than DMS.

Phenotypes of JDG and DMS under salt stress. (A) Phenotypes of JDG and DMS plants after 2 weeks of treatment with 400 mM NaCl. Left, phenotype of the whole plant; right, enlarged image of the protruding part indicated by the red circle. Bars, 3 cm. (B) Detached leaves of rose on different days after onset of salt stress (400 mM NaCl). (C) Anatomical analysis of leaves in (B). Red arrowheads indicate the palisade tissue. Mock (0 mM NaCl); NaCl (400 mM NaCl). Bars, 50 μm. (D) Tissue staining of rose leaves under salt stress using DAB. (E) Quantitative statistics of the relative staining intensity in (D). Brown staining area and total leaf area were measured using ImageJ software; their ratio is the relative staining intensity. (F) Soluble protein content of rose leaves on different days of salt treatment. (G) Relative electrolyte permeability of rose leaves on different days of salt treatment. Data are based on the mean ± SE of at least three repeated biological experiments.

Flavonoid metabolites play an important role in the salinity tolerance of rose

To better understand how salt stress affects rose metabolites, we performed a comprehensive untargeted analysis of metabolites using ultra-performance liquid chromatography/mass spectrometry (UPLC/MS). Fig. S1A shows the different metabolites detected, and Fig. S1B shows the curves of the quality control samples, indicating that the mass spectral data were highly reproducible and reliable. Principal component analysis (PCA) was used to reduce the data dimensions and clarify the relationships among the samples. The first two principal components, PC1 and PC2, explained 50.07% and 23.36% of the variance, respectively. Moreover, PC1 separated the genotypes, while PC2 reflected differences in time of exposure to salt stress. Thus, the metabolite-based PCA revealed obvious differences in salt tolerance between the two cultivars ( Fig. S2A ).
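To make the "variance explained" figures concrete, the sketch below estimates the share of total variance captured by the first principal component using power iteration on the sample covariance matrix. It is a minimal pure-Python illustration under stated assumptions: the function name and the small abundance matrix are hypothetical, and it is not the software or data used in the study.

```python
import math


def explained_variance_pc1(samples, iters=500):
    """Return (share of variance explained by PC1, PC1 direction).

    samples: list of rows (one per sample), each a list of feature values.
    """
    n, d = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in samples]
    # Sample covariance matrix of the features.
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration converges to the leading eigenvector of cov.
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the leading eigenvalue (variance along PC1).
    top = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total = sum(cov[i][i] for i in range(d))  # trace = total variance
    return top / total, v


# Hypothetical metabolite abundances (rows = samples), for illustration only.
abundances = [
    [2.0, 1.0, 0.1],
    [4.1, 2.2, 0.0],
    [6.0, 2.9, 0.2],
    [8.2, 4.1, 0.1],
]
share, direction = explained_variance_pc1(abundances)
print(f"PC1 explains {100 * share:.1f}% of the variance")
```

In the reported analysis, PC1 and PC2 together account for 73.43% of the variance (50.07% + 23.36%), which is why a two-dimensional score plot is enough to separate genotypes and treatment times.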

Our screening for differentially accumulated metabolites (DAMs) identified hundreds of metabolites with significantly altered accumulation under salt stress ( Fig. 2A , Table S1 ). Preliminary analysis indicated that DAMs included amino acids and their derivatives, nucleotides and their derivatives, phenolic acids, flavonoids, lipids, tannins, lignans and coumarins, organic acids, alkaloids, and terpenoids, and most of the DAMs were upregulated under salt stress ( Fig. 2B ). Phenolic acids, lipids, and flavonoid metabolites showed significantly altered accumulation under salt stress in both JDG and DMS. Compared with their levels in DMS, flavonoid metabolites, phenolic acid metabolites, and lipids were differentially accumulated in JDG leaves under both control conditions and salt stress ( Table S1 ). These results indicate that flavonoid metabolites, phenolic acid metabolites, and lipids may play important roles in the salt tolerance of rose.

Metabolomic analysis of JDG and DMS under salt stress. (A) Number of DAMs in different comparison groups. (B) Classification of DAMs in each comparison. (C) Classification of DAMs upregulated in both JDG and DMS under salt treatment. (D) Classification of DAMs upregulated in JDG compared with DMS under both control and salt treatments. (E, F) KEGG pathway enrichment of DAMs under salt stress: (E) JDG-NaCl vs JDG-Mock and (F) DMS-NaCl vs DMS-Mock.

To determine how metabolites differ between JDG and DMS, we summarized the differences in metabolite accumulation in the different comparison groups using Venn diagrams. Groups JDG-NaCl vs JDG-Mock and DMS-NaCl vs DMS-Mock shared 109 DAMs, of which 79 increased and 15 decreased in both groups. Among the upregulated metabolites, phenolic acids and flavonoids accounted for 21.52% and 7.59%, respectively. These metabolites included ferulic acid, coniferaldehyde, pinocembrin (dihydrochrysin), eucalyptin (5-hydroxy-7,4'-dimethoxy-6,8-dimethylflavone), patuletin (quercetagetin-6-methyl ether), naringenin-7- O -rutinoside-4'- O -glucoside, naringin (naringenin-7- O -neohesperidoside), and sudachitin ( Fig. 2C , Fig. S2B–D , Table S1 ). Notably, 5,7,8,4'-tetramethoxyflavone, vanillic acid-4- O -glucoside, and 3',4',5',5,7-pentamethoxyflavone were upregulated in JDG and downregulated in DMS under salt stress, while kaempferol-3- O -arabinoside-7- O -rhamnoside was upregulated in DMS and downregulated in JDG. Groups JDG-Mock vs DMS-Mock and JDG-NaCl vs DMS-NaCl shared 408 DAMs, of which 188 showed increased and 202 showed decreased accumulation. Among the upregulated metabolites, phenolic acids and flavonoids accounted for 29.26% and 33.51%, respectively ( Fig. 2D ). Notably, the genkwanin (apigenin 7-methyl ether) content was 12.74-fold higher, the 5,7-dihydroxy-6,3′,4′,5′-tetramethoxyflavone (arteanoflavone) content was 15.64-fold higher, the naringenin-4′,7-dimethyl ether content was 13-fold higher, and the naringin dihydrochalcone content was 13.30-fold higher in JDG compared with DMS under control conditions; all of these are flavonoid metabolites. Venn analysis also showed that many metabolites displaying changes under salt stress were genotype specific, indicating that the cultivars have different mechanisms of response to salinity.
There were 77 metabolites that specifically accumulated in JDG under salt stress, which may represent the major metabolites in the salt stress response of JDG. Notably, four metabolites—ethylsalicylate (a phenolic acid), salidroside (a phenolic acid), L-ornithine (amino acids and derivatives), and epiafzelechin (a flavonoid)—accumulated specifically in JDG after salt treatment and were also highly accumulated under control conditions in JDG compared with DMS ( Fig. S2B–D , Table S1 ).
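The Venn-style comparison described above amounts to set operations on the DAM lists of two comparison groups, with an extra check on the direction of change. The sketch below is illustrative only: the helper name `compare_dams` is hypothetical, and the inputs are a tiny subset of metabolites whose directions are taken from the text, not the full Table S1.

```python
def compare_dams(group_a, group_b):
    """Partition DAMs from two comparisons by overlap and direction.

    group_a, group_b: dict mapping metabolite name -> "up" or "down".
    """
    shared = set(group_a) & set(group_b)
    return {
        "shared_up": {m for m in shared if group_a[m] == group_b[m] == "up"},
        "shared_down": {m for m in shared if group_a[m] == group_b[m] == "down"},
        "opposite": {m for m in shared if group_a[m] != group_b[m]},
        "only_a": set(group_a) - set(group_b),
        "only_b": set(group_b) - set(group_a),
    }


# Toy subset based on directions reported in the text (NaCl vs Mock).
jdg = {
    "ferulic acid": "up",
    "coniferaldehyde": "up",
    "vanillic acid-4-O-glucoside": "up",
    "salidroside": "up",
}
dms = {
    "ferulic acid": "up",
    "coniferaldehyde": "up",
    "vanillic acid-4-O-glucoside": "down",
}

result = compare_dams(jdg, dms)
for label, names in result.items():
    print(label, sorted(names))
```

Keeping an explicit "opposite" bucket matters here because some metabolites are shared between cultivars but change in opposite directions, so the shared total can exceed the sum of the concordant increases and decreases.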

All DAMs were analyzed using Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment ( Fig. 2E, F , Fig. S3A, B ). In JDG (JDG-NaCl vs JDG-Mock group), salt stress-induced changes in metabolites mainly involved 'purine metabolism,' 'phenylpropanoid biosynthesis,' 'linoleic acid metabolism,' and 'alpha-linolenic acid metabolism' ( Fig. 2E ). In DMS (DMS-NaCl vs DMS-Mock group), the DAMs in leaves under salt stress were mainly associated with 'phenylpropanoid biosynthesis,' 'alpha-linolenic acid metabolism,' 'linoleic acid metabolism,' and 'pentose and glucuronate interconversions' ( Fig. 2F ). In the JDG-Mock vs DMS-Mock group, DAMs between leaves of DMS and JDG were mostly associated with 'flavonoid biosynthesis,' 'flavone and flavonol biosynthesis,' and 'phenylpropanoid biosynthesis' ( Fig. S3A ). Meanwhile, in the JDG-NaCl vs DMS-NaCl group, DAMs were largely involved in 'flavonoid biosynthesis,' 'flavone and flavonol biosynthesis,' and 'linoleic acid metabolism' ( Fig. S3B ). KEGG enrichment analysis showed that 'linoleic acid/alpha-linolenic acid metabolism' and 'phenylpropanoid biosynthesis' were significantly enriched under salt stress in both cultivars, indicating that these two pathways play important roles under salt stress in rose. Regardless of the presence of salt stress, DAMs between DMS and JDG were concentrated in the flavone, flavonoid, and flavonol biosynthetic pathways, indicating that differential accumulation of these metabolites may be the main reason for different salt sensitivities among rose cultivars. Notably, 'caffeine metabolism' was enriched in JDG, while 'starch and sucrose metabolism' was significantly increased in DMS.
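Pathway enrichment of this kind is often assessed with an over-representation test. The sketch below uses a one-sided hypergeometric test, which is a common choice but an assumption here, since this section does not state which test was applied; the function name and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided hypergeometric p-value, P(X >= k).

    N: annotated metabolites in the background
    K: background metabolites annotated to the pathway
    n: differentially accumulated metabolites (DAMs)
    k: DAMs annotated to the pathway
    """
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


# Hypothetical counts, for illustration only.
p = enrichment_p_value(k=8, n=40, K=20, N=500)
print(f"p = {p:.3g}")
```

A small p-value means the pathway contains more DAMs than would be expected if DAMs were drawn at random from the background; in practice the p-values for all tested pathways are then corrected for multiple testing.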

Salt stress causes dynamic changes in distinct sets of proteins

To delve deeper into the molecular mechanisms of the salt stress response in rose plants, we performed a proteome profiling analysis under the same salt treatment and control conditions as the metabolome analysis and characterized proteins on the basis of fold changes in their accumulation level. We identified 119 (87 upregulated and 32 downregulated) and 163 (83 downregulated and 80 upregulated) proteins with significantly differential accumulation under salt stress in JDG and DMS, respectively ( Fig. 3A, B ). Only 18 differentially accumulated proteins (DAPs) overlapped between the two cultivars, of which 13 were upregulated and 4 were downregulated in both JDG and DMS, while one DUF1279 domain–containing protein was upregulated in JDG and downregulated in DMS. Moreover, 101 DAPs were unique to JDG, whereas 145 DAPs were unique to DMS ( Table S2 ).

Proteomic analysis of rose under salt stress. (A) Number of DAPs in JDG and DMS. (B) Venn diagram of the DAPs in JDG and DMS. (C) Localizations of DAPs identified in JDG. (D) Functional categorization of DAPs unique to JDG. (E, F) KEGG enrichment analysis of DAPs in JDG (upregulated, E) and DMS (upregulated, F).

We predicted that most of the DAPs are located in chloroplasts in rose, according to the WoLFPSORT database ( Fig. 3C , Fig. S4A ). Gene Ontology (GO) and KEGG analyses were performed to analyze and annotate protein functions. The 20 most highly enriched GO terms associated with the DAPs are depicted in a circle diagram ( Fig. S5A, B , Table S2 ). Among them, GO:0046658 (anchored component of plasma membrane), GO:0051554 (flavonol metabolic process), GO:0047893 (flavonol 3- O -glucosyltransferase activity), and GO:0051555 (flavonol biosynthetic process) were highly enriched in JDG under salt stress. In DMS, GO:0006720 (isoprenoid catabolic process), GO:0005764 (lysosome), and GO:0004602 (glutathione peroxidase activity) were the most enriched among all GO terms. In addition, the GO data indicated that the DAPs specific to JDG were highly involved in the 'icosanoid metabolic process,' 'diterpenoid metabolic process,' and 'diterpenoid biosynthetic process' ( Fig. 3D ), whereas the DAPs specific to DMS were enriched in 'cellular hyperosmotic salinity response,' 'monocarboxylic acid catabolic process,' 'terpenoid catabolic process,' 'sesquiterpenoid catabolic process,' and 'apocarotenoid catabolic process' functions ( Fig. S4B ). DAPs shared by JDG and DMS included Q2VA35 (xyloglucan endotransglucosylase/hydrolase) and A0A2P6P708 (glutathione peroxidase), which are present only in extracellular regions ( Table S2 ). The DAPs in different comparison groups were classified and then clustered according to enrichment of their associated GO terms ( Fig. S4C ). We determined that salinity mainly influences flavone and flavonol metabolism pathways in JDG. Flavones and flavonols are antioxidants and bioactive reagents [ 24 ]. In DMS, salt mainly influences the osmotic response, water stimulus response, and salt stress response pathways, most of which are stress related [ 31 ]. 
We used KEGG enrichment to determine the metabolic pathways associated with the DAPs in JDG and DMS under salt stress ( Fig. 3E, F ). Many DAPs in JDG were associated with phenylpropanoid biosynthesis and alpha-linolenic acid metabolism, with examples including lipoxygenase (A0A2P6S713), 12-oxophytodienoate reductase (A0A2P6PFD8), peroxidase (A0A2P6R8H8), and flavone 3′- O -methyltransferase (A0A2P6RK21). The DAPs upregulated in DMS under salt stress were frequently associated with alpha-linolenic acid metabolism and glutathione metabolism, whereas the DAPs that were downregulated were associated with ribosomes ( Table S2 ). Notably, alpha-linolenic acid metabolism was significantly upregulated in both JDG and DMS under salt stress. Collectively, the GO and KEGG enrichment results show that salt stress causes dynamic changes in distinct sets of proteins in rose.

Salt stress differentially alters the transcriptomes of JDG and DMS

To identify the genes involved in salt stress and explore the molecular mechanisms of salt tolerance in DMS and JDG, we sequenced the transcriptomes of JDG and DMS leaves by RNA sequencing (RNA-seq). We obtained high-quality reads for transcriptome analysis ( Table S3 ). PCA showed a distinct difference between the two cultivars along PC1, and PC2 separated the treatment from the control. The three biological replicates in the ordination space were mostly clustered together, suggesting an acceptable correlation between replicates ( Fig. 4A ).

Transcriptomic analysis of JDG and DMS under salt stress. (A) PCA score plot of transcriptomic profiles from different cultivars. (B) Number of DEGs in JDG and DMS. (C–E) Venn diagrams of DEGs in JDG and DMS: (C) total DEGs, (D) upregulated DEGs, and (E) downregulated DEGs. (F, G) KEGG enrichment analysis of DEGs in JDG (F) and DMS (G).

Correlation analysis of transcriptome, proteome, and metabolome data. (A, B) KEGG enrichment analysis of combined transcriptome, proteome, and metabolome data: (A) JDG-NaCl vs JDG-Mock, and (B) DMS-NaCl vs DMS-Mock. The x-axis shows the enrichment factor of the pathway in the different omics datasets, and the y-axis shows the name of the KEGG pathway; the color from red to green represents the significance of enrichment from high to low (indicated by the P value). The size of each bubble indicates the number of DEGs, DAPs, or DAMs; the larger the number, the larger the symbol. The shape of each bubble indicates the omics layer: circles represent genes, triangles represent metabolites, and squares represent proteins. (C) Co-expression network of major genes, proteins, and metabolites in the phenylpropanoid pathway. Different colors indicate the value of log2 fold change (NaCl/Mock), with red for upregulated and blue for downregulated genes, proteins, or metabolites.

We analyzed differentially expressed genes (DEGs) in JDG and DMS under control and salt stress conditions. We detected 10,662 DEGs in DMS under salt stress, of which 4651 were upregulated and 6011 were downregulated. However, only 1990 genes were differentially expressed in JDG: 1102 upregulated and 888 downregulated ( Fig. 4B ). The smaller number of DEGs in JDG than in DMS under salt stress implies that JDG is less affected by salt stress. We used a Venn diagram to display the differences between various genes in DMS and JDG under salt stress. Group DMS-NaCl vs DMS-Mock and group JDG-NaCl vs JDG-Mock shared 1120 DEGs under salt stress, with 577 upregulated genes and 433 downregulated genes ( Fig. 4C–E ).
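Calling genes up- or downregulated typically combines a fold-change cutoff with a significance cutoff. The sketch below shows that logic in its simplest form; the cutoffs, the pseudocount, the gene labels, and the expression values are all hypothetical, and the adjusted p-values are assumed to come from a dedicated differential expression tool rather than being computed here.

```python
import math


def classify_genes(records, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Split genes into upregulated and downregulated lists.

    records: iterable of (gene_id, mean_mock, mean_nacl, adjusted_p).
    The cutoffs are illustrative defaults, not those of the study.
    """
    up, down = [], []
    for gene, mock, nacl, padj in records:
        # A pseudocount of 1 avoids division by zero for unexpressed genes.
        lfc = math.log2((nacl + 1) / (mock + 1))
        if padj < padj_cutoff and abs(lfc) >= lfc_cutoff:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Hypothetical expression summaries, for illustration only.
records = [
    ("geneA", 10, 50, 0.001),  # strong increase, significant
    ("geneB", 40, 8, 0.01),    # strong decrease, significant
    ("geneC", 20, 22, 0.8),    # essentially unchanged
    ("geneD", 5, 30, 0.2),     # large change, not significant
]
up, down = classify_genes(records)
print("up:", up)
print("down:", down)
```

The reported totals are internally consistent with this kind of split: 4651 + 6011 = 10,662 DEGs in DMS and 1102 + 888 = 1990 DEGs in JDG.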

Next, we performed GO analysis of DEGs in the categories cellular component (CC), biological process (BP), and molecular function (MF). The top 21 most enriched GO terms associated with DEGs of JDG-NaCl vs JDG-Mock and DMS-NaCl vs DMS-Mock are presented in circle diagrams ( Fig. S6 , Table S4 ). Seven GO terms associated with the JDG-NaCl vs JDG-Mock group were highly involved in the BP category, among which GO:0016052 (carbohydrate catabolic process), GO:0009813 (flavonoid biosynthetic process), and GO:0009812 (flavonoid metabolic process) contained the most DEGs (43, 26, and 27, respectively), and most of these enriched genes were upregulated. Thirteen GO terms were highly involved in the MF category, among which GO:0010427 (abscisic acid binding), GO:0016832 (aldehyde-lyase activity), and GO:0019840 (isoprenoid binding) were highly significant. One GO term was highly involved in the CC category: GO:0031226 (intrinsic component of plasma membrane). Moreover, 19 GO terms associated with the DMS-NaCl vs DMS-Mock group were enriched in the BP category, among which GO:0036294 (cellular response to decreased oxygen levels), GO:0048511 (rhythmic process), and GO:0048585 (negative regulation of response to stimulus) contained the most DEGs (85, 95, and 146, respectively), and most of these enriched genes were downregulated. One GO term was enriched in the MF category: GO:0016854 (racemase and epimerase activity). Similarly, one GO term was enriched in the CC category: GO:0009501 (amyloplast). KEGG pathway enrichment analysis for JDG-NaCl vs JDG-Mock revealed that the DEGs were mainly involved in metabolic pathways, plant hormone signal transduction, biosynthesis of secondary metabolites, and glycolysis/gluconeogenesis ( Fig. 4F , Table S4 ). In the DMS-NaCl vs DMS-Mock group, the DEGs were chiefly enriched in metabolic pathways, plant hormone signal transduction, the MAPK signaling pathway, biosynthesis of cofactors, and ubiquitin-mediated proteolysis ( Fig. 4G , Table S4 ). These findings indicate that the biosynthesis of secondary metabolites is substantially enhanced under salt stress in JDG, but not in DMS. However, the biosynthesis of cofactors associated with primary metabolism is enhanced under salt stress in DMS. Therefore, we speculate that salinity results in large changes in primary metabolism in DMS, while it influences secondary metabolism in JDG.

Transcription factors (TFs) are essential for regulating the expression of stress response genes. Among the DEGs, we identified 114 TFs in JDG and 491 TFs in DMS, covering 39 TF families ( Table S4 ). The most abundant genes belonged to the AP2/ERF-ERF, MYB, NAC, bHLH, and C2C2 families ( Fig. S7A, B ). Moreover, 64 TFs were differentially expressed in both cultivars in response to salinity. We speculate that these TFs form a highly complex transcriptional regulatory network and could perform critical functions in the mechanism of salt tolerance in rose.

Expression of phenylpropanoid-related genes is correlated with proteins and metabolites affected by salt stress

Integrated analysis of multi-omics data provides a powerful tool for identifying significantly different pathways and crucial metabolites in biological processes. Here, we integrated our transcriptome, proteome, and metabolome data to determine the performance of the two rose cultivars under salt stress. Pathways associated with alpha-linolenic acid metabolism, phenylpropanoid biosynthesis, and starch and sucrose metabolism were significantly enriched in JDG under salt stress ( Fig. 5A ), while the pathways enriched in DMS were involved in starch and sucrose metabolism, cyanoamino acid metabolism, and phenylpropanoid biosynthesis ( Fig. 5B ). Starch and sucrose metabolism represent primary metabolic functions common to different cultivars [ 32 ], while alpha-linolenic acid metabolism is related to the biosynthesis of jasmonic acid, which is a phytohormone involved in fungal invasion and senescence [ 7 ]. The phenylpropanoid biosynthesis pathway comprises multiple secondary metabolites, which confer a range of colors, flavors, nutritional components, and bioactivities in plants. Flavonoids are an important type of phenylpropanoid that play key roles in resistance against biotic and abiotic stresses [ 24 ]. Thus, we focused on the phenylpropanoid pathway.

Gene–protein–metabolite correlation networks can be used to elucidate functional relationships and identify regulatory factors. Therefore, we analyzed the regulatory networks of the DEGs, DAPs, and DAMs related to phenylpropanoid metabolism. We identified 14 DEGs that were strongly correlated with one DAP and six DAMs in JDG under salt stress. Similarly, 25 DEGs were strongly correlated with one DAP and eight DAMs in DMS under salt stress ( Table S5 ). For example, in JDG, there was a strong correlation between the expression of one gene (RchiOBHmChr4g0430951) and the abundance of one protein (A0A2P6PM56) and two metabolites [coniferyl alcohol (mws0093) and sinapyl alcohol (mws0853)]. Epiafzelechin (mws1422) was also significantly associated with the expression of the gene RchiOBHmChr2g0092641. In DMS, there was a close association between the expression of three genes (RchiOBHmChr2g0092671, RchiOBHmChr3g0480401, and RchiOBHmChr5g0041231) and the abundance of one protein (A0A2P6QM41) and one metabolite [L-tyrosine (mws0250)]. The strong association of particular genes with phenylpropanoid proteins or metabolites suggests that these genes play a major role in phenylpropanoid biosynthesis under salt stress.

We selected 20 important genes in the phenylpropanoid biosynthetic pathway and compared their expression between the two rose cultivars (Table S6). The transcript levels of many genes (4CL1, CCR1, HCT1, HCT2, HCT3, HCT4, CHS1, CHS2, CHI, DFR, F3H, and ANR) were higher in JDG than in DMS, which may benefit salt tolerance by stimulating JDG to produce more flavonoids. Our multi-omics analysis revealed that ferulic acid, sinapic acid, and coniferaldehyde accumulated to high levels in JDG under salt stress (Fig. 5C, Table S1). We also compared the flavonoid compounds in the two cultivars. Quercetin-3,3′-dimethyl ether, 5,7-dihydroxy-6,3′,4′,5′-tetramethoxyflavone (arteanoflavone), naringenin-4′,7-dimethyl ether, naringin dihydrochalcone, genkwanin (apigenin 7-methyl ether), and mearnsetin accumulated to greater levels in JDG than in DMS under control conditions. Correspondingly, the flavonoids brickellin, 3-O-methylquercetin, 5,2′,5′-trihydroxy-3,7,4′-trimethoxyflavone-2′-O-glucoside, and kaempferol-3-O-(6′′-acetyl)glucosyl-(1→3)-galactoside were more abundant in JDG than in DMS under salt stress. By contrast, naringenin-4′,7-dimethyl ether, aromadendrin (dihydrokaempferol), pinocembrin-7-O-(6′′-O-malonyl)glucoside, and quercetin-3-O-(2′′-O-glucosyl)glucuronide accumulated specifically in DMS. Moreover, 3′,4′,5′,5,7-pentamethoxyflavone, 3,5,7,3′,4′-pentamethoxyflavone, and 5,7,8,4′-tetramethoxyflavone were abundant in JDG under salt stress but decreased in DMS (Table S7). Overall, the integration of the three omics datasets indicated that the phenylpropanoid pathway, especially the flavonoid pathway, is strongly enhanced under salinity conditions and that this contributes to salt tolerance in roses, especially in the JDG genotype.

Networks of co-expressed genes associated with phenylpropanoid biosynthesis are involved in the salt stress response

To identify candidate genes associated with phenylpropanoid biosynthesis, we constructed co-expression gene network modules via weighted gene correlation network analysis (WGCNA). We constructed a cluster tree based on the correlation between expression levels (measured as fragments per kilobase of transcript per million fragments mapped, FPKM), which partitioned the genes into 11 different gene modules (Fig. 6A, B). To identify candidate genes that play significant roles within the gene networks, we extracted annotation information for all these genes from the Rosa chinensis 'Old Blush' reference genome annotation database. We selected 16 genes contributing to phenylpropanoid biosynthesis and four genes associated with flavonoid biosynthesis. Table S8 lists the annotated genes participating in flavonoid-related pathways in JDG. Among the 11 modules, the green module contained 10 of these genes: CHS1, CHS2, CCR1, HCT3, HCT4, CCoAOMT, F3H, DFR, ANR, and CHI. The turquoise module contained three genes: CCR2, HCT1, and CAD2. The blue module contained three genes: PRDX1, 4CL1, and ANS. The red, yellow, brown, and black modules each contained one gene: CAD1, PRDX2, HCT2, and 4CL2, respectively (Table S8). After combining certain genes in modules and comparing them with the DEGs, we checked and confirmed these results using reverse-transcription quantitative PCR (RT-qPCR). The expression trends of eight DEGs from the phenylpropanoid and flavonoid biosynthesis pathways matched the RNA-seq results (Fig. S8).

Co-expression network related to flavonoid biosynthesis. (A) Clustering tree based on the correlation between gene expression levels. (B) Module–sample relationships. Each row represents a gene module, with the same colors as in (A); each column represents a sample; the boxes within the chart contain the corresponding correlations and P values. (C–E) Networks built from correlations among structural genes and TFs. Circles represent genes, and the size of each circle represents the number of relationships between that gene and surrounding genes in the network. Lines represent regulatory relationships between genes, and different colored lines represent different connection strengths: red, strong connections; green, weak connections. (F) Heat map depicting the expression profiles of 15 TF genes. The scale bar denotes the fold change/(mean expression levels across the three treatment groups). The color indicates relative levels of gene expression, horizontal rows represent the different treatments in JDG, and vertical columns show the TFs. (G) Representative images of transient expression of bHLH74 and LUC driven by the CHS1 promoter in Nicotiana benthamiana leaves. The color scale represents the signal level: high represents a strong signal, and low represents a weak signal. (H) Relative value of LUC/REN. Data are the mean ± SE of at least three biological replicates. Significance was determined using Student’s t-test (**P < 0.01).

To determine the regulatory genes involved in phenylpropanoid biosynthesis in JDG, we constructed three subnetworks from the different modules using the 20 phenylpropanoid biosynthesis–related DEGs as the nodes ( Table S9 ). In the regulatory networks of phenylpropanoid biosynthesis, we identified 15 TF genes from seven TF families: AP2/ERF-ERF (5 unigenes), bHLH (3 unigenes), MYB (3 unigenes), Alfin-like (1 unigene), SBP (1 unigene), C2C2-GATA (1 unigene), and TCP (1 unigene). bHLH62 and bHLH74 were strongly associated with CHS1 , CHS2 , CHI , CCR1 , and F3H ; ERF81 was strongly associated with 4CL1 ; and ERF110 and MYB-related were strongly associated with 4CL2 ( Fig. 6C–E ), indicating that CHS and 4CL are the major target genes in phenylpropanoid biosynthesis. Therefore, we speculated that the abundance of flavonoids is increased by enhancing the expression of upstream flavonoid biosynthesis genes. Fig. 6F shows a heat map of expression of the 15 TF genes after NaCl treatment. The green module contained a substantial number of phenylpropanoid biosynthesis genes, among which CHS1 was closely related to the TFs bHLH74 and bHLH62. Therefore, dual-luciferase reporter assays were conducted to determine their regulatory relationship ( Fig. 6G, H ). We used bHLH74 and bHLH62 driven by the CaMV35S promoter as effectors in a transient expression system, with the CHS1 promoter fused with LUC as a reporter. When we cotransformed Nicotiana benthamiana leaves with the effectors and the reporter, the LUC/REN ratio of CHS1 was 0.3/1, which was drastically lower than those of the controls ( Fig. 6G, H , Fig. S9A, B ). These results indicate that bHLH74, but not bHLH62, inhibits the expression of CHS1 .

Salt stress damages the structure and osmotic potential of rose leaves

Roses belong to the Rosaceae family and are one of the most important commercial flower crops. Extracts from various parts of the rose plant have also been shown to have excellent biological activity and are used in industries such as cosmetics, perfume and medicine [ 1 ]. Meanwhile, an increasing number of wild rose varieties with significant health benefits are being domesticated and brought into mainstream cultivation [ 33 ]. Salt stress is one of the most widespread abiotic constraints for rose cultivation. Salt stress threatens plant survival and growth but can stimulate an increase in the biosynthesis of secondary metabolites [ 34 ]. Previous studies have shown that optimal coordination between leaf structure and photosynthetic processes is essential for enabling plants to tolerate salt stress [ 35 ]. When exposed to salt treatment, leaves become thicker and smaller while the palisade tissue and spongy tissue become loose and jumbled and the intercellular space of the mesophyll becomes thinner [ 36–39 ]. We observed that the palisade tissue of DMS was loose, disordered, and severely damaged compared with that in JDG under salt stress ( Fig. 1C ). This indicates that DMS is more sensitive to salt stress than JDG. Typically, excessive ROS accumulate under stress conditions, which can lead to membrane oxidative damage (lipid peroxidation) [ 40 ]. Silencing of the gene GmNAC06 in soybean ( Glycine max ) leads to accumulation of ROS under salt stress, which in turn leads to significant losses in soybean production [ 41 ]. In Arabidopsis , the sibp1 mutant accumulates more ROS than wild-type plants or AtSIBP1-overexpressing plants, resulting in a lower survival rate under salt treatment [ 42 ]. In this study, salinity led to a greater accumulation of ROS in DMS compared with JDG, as detected by DAB staining ( Fig. 1D, E ). This indicates that DMS suffers greater damage under salinity stress. 
Excessive accumulation of ROS in cells can lead to membrane oxidative damage and trigger the production of enzyme systems or non-enzyme free radical scavengers to cope with oxidative damage [ 10 ]. Here, antioxidant enzyme activities such as peroxidase (A0A2P6R8H8) and glutathione peroxidase (A0A2P6P708) were upregulated in roses under salt treatment ( Table S2 ). This suggests that rose plants maintain lower ROS levels by upregulating the activity of antioxidant enzymes, thereby protecting photosynthetic mechanisms and maintaining plant growth under salt stress. Among the nonenzymatic antioxidants, phenols and flavonoids accumulate in various tissues and contribute to free radical scavenging that enhances plant salt tolerance [ 43 ]. Indeed, we identified significant differences in the contents of phenolic acids, lipids, and flavonoid metabolites in JDG and DMS under control and salt stress conditions ( Table S1 ). Moreover, our transcriptomic and proteomic analysis revealed the activation of genes and proteins within the phenylpropanoid and flavonol pathways. This activation results in the accumulation of various phenolic compounds, potentially enhancing their capacity for scavenging ROS.

Flavonoids are beneficial for improving salt tolerance in rose

Phenolic compounds, such as flavonoids, are among the most widespread secondary metabolites observed throughout the plant kingdom [ 44 ]. These compounds fulfill various biochemical and molecular functions within plants, encompassing roles in plant defense, signal transduction, antioxidant action, and the scavenging of free radicals [ 45 ]. Environmental changes commonly trigger the flavonoid pathway, which aids in shielding plants from the harmful effects of ultraviolet radiation, salt, heat, and drought [ 23 , 46 , 47 ]. Moreover, flavonoids demonstrate potent biological activity and serve as significant antioxidants [ 48 ]. Recently, researchers and consumers have been interested in plant-based polyphenols and flavonoids for their antioxidant potential, their dietary accessibility, and their role in preventing fatal diseases such as cardiovascular disease and cancer [ 49 ]. Our transcriptomics analysis showed that salinity causes significant alterations in the secondary metabolism of JDG, while affecting the primary metabolism of DMS. Proteomics showed that phenylpropanoid biosynthesis is significantly enhanced in JDG under salt stress, especially through the flavonoid pathway. In DMS, glutathione metabolism is significantly enhanced under salt stress, indicating differences in salt tolerance pathways between the two cultivars. Our metabolome data indicated that the abundance of phenolic acid and flavonoid metabolites was significantly altered in both JDG and DMS under salt stress. Furthermore, by comparing their contents in leaves under salt stress and control conditions, we found that more flavonoids accumulated in DMS than in JDG under salt stress. This evidence suggests that DMS requires an increased presence of flavones to withstand the damage caused by salinity. 
By contrast, salinity stress did not trigger a substantial buildup of flavonoids in JDG, possibly due to the adequate levels of flavonoids already present under normal conditions, which provided ample tolerance to salt-induced stress. This observation could also explain the higher tolerance of JDG to salt stress ( Table S1 ). When we compared the flavonoid metabolites of the phenylpropanoid pathway to identify flavonoid metabolites associated with salt tolerance, we found that 17 phenolic acid metabolites and 6 flavonoid metabolites were significantly differentially accumulated in both genotypes. Of these compounds, ferulic acid serves as a free radical scavenger, while simultaneously serving as an inhibitor for enzymes engaged in generating free radicals and boosting the activity of scavenger enzymes [ 49 ]. Sinapic acid is a bioactive phenolic acid with anti-inflammatory and anti-anxiety effects [ 50 ]. Pinocembrin, a naturally occurring flavonoid found in fruits, vegetables, nuts, seeds, flowers, and tea, is an anti-inflammatory, antimicrobial, and antioxidant agent [ 51 ]. This indicates that these two rose cultivars contain beneficial metabolites with some economic value. We investigated the possible effects of these metabolites in conferring salt tolerance in rose by comparing specific DAMs between JDG and DMS. Among these DAMs, eight metabolites were upregulated and six metabolites were downregulated under salt treatment in JDG compared to DMS. Among these eight upregulated DAMs, the contents of 3- O -methylquercetin, brickellin, 5,2′,5′-trihydroxy-3,7,4′-trimethoxyflavone-2′- O -glucoside, and kaempferol-3- O -(6′′-acetyl)glucosyl-(1→3)-galactoside accumulated significantly with salinity ( Table S7 ). These metabolites have important functions. For example, 3- O -methylquercetin has potent anticancer, antioxidant, antiallergy, and antimicrobial activities and shows strong antiviral activity against tomato ringspot virus [ 52 ]. 
Kaempferol, a biologically active compound found in numerous fruits, vegetables, and herbs, demonstrates various pharmacological benefits, such as antimicrobial, antioxidant, and anticancer properties [ 53 ]. This indicates that JDG is an excellent rose cultivar that is both salt tolerant and rich in beneficial bioactive substances.

bHLH74 regulates flavonoid biosynthesis

The biosynthesis of flavonoids is initiated from the amino acid phenylalanine, giving rise to phenylpropanoids that subsequently enter the flavonoid-anthocyanin pathway [ 25 ]. The CHS enzyme is situated at a crucial regulatory position preceding the flavonoid biosynthetic pathway, directing the flow of the phenylpropanoid pathway towards flavonoid production, which has been extensively documented in many plant species [ 54 , 55 ]. In rice ( Oryza sativa ), defects in the flavonoid biosynthesis gene CHS can alter the distribution of flavonoids and lignin [ 56 ]. In eggplant ( Solanum melongena L.), CHS regulates the content of anthocyanins in eggplant skin under heat stress [ 57 ]. In apple ( Malus domestica ), overexpression of CHS increases the accumulation of flavonoids and enhances nitrogen absorption [ 58 ]. We identified a positive correlation between flavonoid accumulation and the expression of CHS genes, in agreement with previous reports. The bHLH TFs involved in regulating flavonoid biosynthesis work in a MYB-dependent or -independent manner. For example, DvIVS, a bHLH transcription factor in dahlia ( Dahlia variabilis ), activates flavonoid biosynthesis by regulating the expression of Chalcone synthase 1 ( CHS1 ) [ 59 ]. The Arabidopsis bHLH proteins TRANSPARENT TESTA 8 (AtTT8) and ENHANCER OF GLABRA 3 (AtEGL3) are all involved in the biosynthesis of various flavonoids [ 60–62 ]. In Chrysanthemum ( Chrysanthemum morifolium ), CmbHLH2 significantly activates CmDFR transcription, leading to anthocyanin accumulation, especially when in coordination with CmMYB6 [ 63 ]. In blueberry ( Vaccinium sect. Cyanococcus ), the bHLH25 and bHLH74 TFs potentially engage with MYB or directly hinder the expression of genes responsible for flavonoid biosynthesis, thereby regulating flavonoid accumulation [ 64 ]. 
In apple (Malus domestica), expression of bHLH62, bHLH74, and bHLH162 is significantly negatively correlated with anthocyanin content, and these TFs have been shown to inhibit anthocyanin biosynthesis [65]. In apple fruit skin, hypermethylation of bHLH74 in the mCG context leads to transcriptional inhibition of downstream anthocyanin biosynthesis genes [66]. In rose, our co-expression network revealed a strong correlation between CHS and genes encoding TFs such as bHLH74 and bHLH62 in the key gene network. bHLH proteins can bind to the promoter regions of pivotal genes encoding enzymes, playing important roles in regulating DAMs under salt stress. Dual-luciferase reporter assays showed that LUC bioluminescence was suppressed well below background levels in Nicotiana benthamiana leaves infiltrated with pCHS1:LUC plus 35S:bHLH74, but not 35S:bHLH62 (Fig. 6G, H, Fig. S9A, B). Thus, we conclude that bHLH74 negatively regulates flavonoid biosynthesis by directly inhibiting the expression of CHS1, which is involved in the flavonoid biosynthetic pathway.

We examined the morphological phenotypes, transcriptomes, proteomes, and widely targeted metabolomes of JDG and DMS under salt stress. Multi-omics analysis revealed that the phenylpropanoid pathway, especially the flavonoid pathway, contributes strongly to salt tolerance in rose, particularly in JDG. Meanwhile, the bHLH74 TF negatively regulates flavonoid biosynthesis by repressing the expression of the CHS1 gene involved in the flavonoid biosynthetic pathway. This research advances our understanding of the regulatory mechanisms of plant development and secondary metabolism underlying salt stress responses in rose, offering insights that could be used to develop new strategies for improving plant tolerance to salinity.

Plant materials and growth conditions

Rosa hybrida cv. Jardin de Granville (JDG) and Rosa damascena Mill. (DMS) were planted in the Science and Technology Park of China Agricultural University (40°03′N, 116°29′E). Rose plants were propagated by cutting culture. Rose shoots with at least two nodes and approximately 6 cm in length were used as cuttings and inserted into square flowerpots (diameter 8 cm) containing a mixture of vermiculite and peat soil [1:1 (v/v)]. Cuttings were soaked in 0.15% (v/v) indole-3-butyric acid (IBA) before insertion into pots and then grown in a growth chamber at 25°C with 50% relative humidity and a cycle of 8 hours of darkness/16 hours of light for 1 month until rooting [67].

Nicotiana benthamiana plants were used for measurement of transient expression. Seeds were sown in square flowerpots (diameter 8 cm); after 1 week, seedlings were transplanted into different pots. The soil and cultivation conditions for N. benthamiana cultivation were the same as those for roses.

Salt treatment

Twenty JDG and 20 DMS rose cuttings displaying good rooting and uniform appearance were selected for salt treatment experiments. JDG or DMS plants were randomly divided into two groups watered with either 0 or 400 mM NaCl. Phenotypes were recorded after 2 weeks. This process was repeated three times [ 68 ].

Salt treatment of rose leaves was described previously [ 68 ]. Thirty JDG and 30 DMS rose cuttings with good rooting and uniform appearance were selected, and mature leaves of similar size were collected. The leaves were divided into two treatment groups, each containing 30 leaves: group A, immersed in deionized water treatment, and group B, immersed in 400 mM NaCl treatment. Phenotypes were observed after 0, 2, and 4 days. On the second day of treatment, leaves showed obvious differences. By the fourth day of treatment, the leaves had become soft or had died. Therefore, sequencing data from the second day were used. Three independent biological replicates were assayed.

Relative electrolyte permeability

Relative electrolyte permeability was determined as previously reported [69] with the following modifications. Salt-treated leaves (0.1 g) were weighed, placed in a 50-ml centrifuge tube, and covered with 20 ml deionized water. The conductivity of the deionized water was measured and defined as EC0. After shaking for 20 minutes at 60 rpm on an orbital shaker, the conductivity at room temperature was measured and defined as EC1. The centrifuge tube was then placed in boiling water for 10 minutes and cooled to room temperature, and the conductivity of the solution was measured as EC2. The relative electrolyte permeability (as a percentage) was calculated as (EC1 − EC0) / (EC2 − EC0) × 100%.
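The calculation can be illustrated with a minimal Python sketch. The function name and the conductivity readings are hypothetical and are not measurements from this study; only the formula itself comes from the text.

```python
def relative_electrolyte_permeability(ec0: float, ec1: float, ec2: float) -> float:
    """Return relative electrolyte permeability as a percentage.

    ec0: conductivity of the water before leaf electrolytes leak into it
    ec1: conductivity after shaking at room temperature
    ec2: conductivity after boiling and cooling (total electrolytes)
    """
    denominator = ec2 - ec0
    if denominator <= 0:
        raise ValueError("EC2 must exceed EC0 for the ratio to be meaningful")
    return (ec1 - ec0) / denominator * 100.0


# Hypothetical readings in microsiemens per centimetre.
example = relative_electrolyte_permeability(ec0=2.0, ec1=52.0, ec2=202.0)
print(f"{example:.1f}%")
```

A higher percentage means that a larger share of the total electrolytes had already leaked out before boiling, which is read as greater membrane damage.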

Soluble protein content

Soluble protein content was determined following the method of Bradford (1976) [ 70 ]. Leaf samples (0.5 g) were placed in a mortar with 8 ml distilled water and a small amount of quartz sand, crushed thoroughly, and incubated at room temperature for 0.5 hours. After centrifugation at 3,000 g for 20 minutes at 4 °C, the supernatant was transferred to a 10-ml volumetric flask and the volume was adjusted to 10 ml with distilled water. Two 1.0-ml aliquots of this sample extraction solution (or distilled water as a control) were transferred to clean test tubes, 5 ml of Coomassie Brilliant Blue reagent was added, and the tubes were shaken well. After 2 minutes, when the reaction was complete, the absorbance and chromaticity at 595 nm were measured, and the protein content was determined using a standard curve.
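As a rough illustration of the standard-curve step, the sketch below fits a straight line to hypothetical standards and converts an absorbance reading into protein content per gram of fresh weight. The standard amounts, absorbances, and the conversion formula (micrograms in the aliquot scaled by extract volume, aliquot volume, and sample weight) are assumptions chosen to match the volumes given above; they are not taken from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_mg_per_g(a595, slope, intercept,
                     extract_ml=10.0, aliquot_ml=1.0, fresh_weight_g=0.5):
    """Convert an absorbance at 595 nm into mg protein per g fresh weight."""
    protein_ug = (a595 - intercept) / slope  # micrograms in the assayed aliquot
    return protein_ug * extract_ml / (aliquot_ml * fresh_weight_g * 1000.0)


# Hypothetical standards: micrograms of protein vs. blank-corrected A595.
standards_ug = [0, 20, 40, 60, 80, 100]
absorbances = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
slope, intercept = fit_line(standards_ug, absorbances)

content = protein_mg_per_g(0.25, slope, intercept)
print(f"{content:.2f} mg/g fresh weight")
```

The Bradford response is only approximately linear over a limited range, so readings outside the range of the standards should be diluted and re-measured rather than extrapolated.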

Leaf anatomical structure

Paraffin sections were prepared as described previously with some modifications [71]. Leaves from the control and NaCl treatments were collected, washed gently with deionized water at room temperature, and stored at 4°C until further use. A 3-mm × 5-mm sample was cut from the same part of each leaf, and these leaf samples were fixed in 2.5% (v/v) glutaraldehyde. Samples were dehydrated in acetone through a concentration gradient of 30%, 50%, 70%, 80%, 95%, and 100% (v/v) and then embedded in paraffin. The embedded tissues were cut into 3-μm sections using a Leica RM2265 rotary microtome (Leica Microsystems, Wetzlar, Germany). Slides were stained with 0.02% (v/v) toluidine blue for 5 minutes, and the residual toluidine blue was removed using distilled water. Slides were allowed to dry and then observed under a microscope (OLYMPUS BH-2, Tokyo, Japan). Three independent biological replicates were examined.

DAB (3,3′-diaminobenzidine) staining for H2O2

H2O2 content was detected using the DAB staining method [72]. Leaves treated with NaCl or control leaves were rinsed clean with distilled water, immersed in DAB solution (1 mg/ml, pH 3.8), and placed under vacuum at approximately 0.8 MPa for 5 minutes; this process was repeated three to six times until the leaves were completely infiltrated. Leaves were then incubated in a box in the dark for 8 hours until a brown sediment was observed. Chlorophyll was removed by repeatedly washing with eluent (ethanol:lactic acid:glycerol, 3:1:1, v/v/v). Decolorized leaves were photographed to record their phenotypes. ImageJ was used to quantify the stained areas.

UPLC-QQQ-based widely targeted metabolome analysis

Metabolomics analysis was performed on four groups of samples: JDG-Mock, JDG-NaCl, DMS-Mock, and DMS-NaCl. Extraction and determination of metabolites were performed with the assistance of Wuhan Metware Biotechnology Co., Ltd. Samples were crushed using a mixer mill with zirconia beads (MM 400, Retsch). Freeze-dried samples (0.1 g) were incubated overnight with 1.2 ml 70% (v/v) methanol solution at 4 °C, then centrifuged at 13,400 g for 10 minutes. The extracts were filtered and subjected to LC-MS/MS analysis [73]. A previously described procedure [74] was followed for the analytical conditions and for quantifying metabolites using an LC-ESI-Q TRAP-MS/MS in multiple reaction monitoring (MRM) mode. The R function prcomp was used for PCA, significantly different metabolites were defined by |log2 fold change| ≥ 1, and annotated metabolites were mapped to the KEGG pathway database (http://www.kegg.jp/kegg/pathway.html). Comparisons are written as treated vs untreated; for example, JDG-NaCl vs JDG-Mock indicates that metabolites are described as upregulated or downregulated in the NaCl sample relative to the Mock sample.
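The fold-change criterion can be sketched as follows. This minimal Python example applies only the |log2 fold change| ≥ 1 threshold stated above; the metabolite names and abundances are hypothetical, and any additional filters used by the analysis provider are not represented.

```python
import math


def log2_fold_change(treated_mean: float, control_mean: float) -> float:
    """log2 of the treated/control abundance ratio (both must be positive)."""
    return math.log2(treated_mean / control_mean)


def classify(treated_mean: float, control_mean: float, threshold: float = 1.0) -> str:
    """Label a metabolite as 'up', 'down', or 'unchanged' in the treated group."""
    lfc = log2_fold_change(treated_mean, control_mean)
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical mean abundances: (NaCl, Mock).
abundances = {
    "metabolite_A": (400.0, 100.0),  # 4-fold higher under NaCl
    "metabolite_B": (100.0, 250.0),  # 2.5-fold lower under NaCl
    "metabolite_C": (120.0, 100.0),  # 1.2-fold, below the threshold
}
calls = {name: classify(nacl, mock) for name, (nacl, mock) in abundances.items()}
print(calls)
```

Because the ratio is always treated over untreated, "up" here corresponds to higher abundance in the NaCl sample, matching the comparison convention described in the text.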

Tandem mass tag-based proteomic analysis

Experiments were carried out with the assistance of Hangzhou Jingjie Biotechnology Co., Ltd. Samples were thoroughly ground into powder in liquid nitrogen, and proteins were extracted using the phenol extraction method. Proteins were digested with trypsin overnight, and the resulting peptides were labeled with TMT tags. LC-MS/MS analysis was performed using an EASY-nLC 1200 UPLC system (ThermoFisher Scientific) and a Q Exactive™ HF-X mass spectrometer (ThermoFisher Scientific) [75]. A fold-change threshold of 1.3 was used to define significant changes in protein abundance. GO (http://www.ebi.ac.uk/GOA/) and KEGG categories were used to annotate DAPs; WoLF PSORT software was used to predict subcellular localization (https://wolfpsort.hgc.jp/).

Transcriptome sequencing

We constructed 12 cDNA libraries (three biological replicates for each of JDG and DMS under each treatment) for RNA-seq. Transcriptome sequencing was completed at Wuhan Metware Biotechnology Co., Ltd. RNA purity and RNA integrity were determined using a NanoPhotometer spectrophotometer and an Agilent 2100 Bioanalyzer, respectively. The RNA libraries were then sequenced on the Illumina HiSeq platform. Raw data were filtered using fastp v0.19.3 and aligned to the reference genome (https://lipm-browsers.toulouse.inra.fr/pub/RchiOBHm-V2/). FPKM (fragments per kilobase of transcript per million fragments mapped) was used to measure gene expression levels, with the threshold for significant differential expression being |log2 fold change| ≥ 1 and false discovery rate < 0.05. GO and KEGG categories were used to annotate DEGs [76].
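For readers unfamiliar with the FPKM unit, the following minimal Python sketch shows the standard normalization and the two-part DEG criterion stated above. The function names and input numbers are hypothetical; the actual pipeline used by the sequencing provider is not described in the text and is not reproduced here.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million fragments mapped."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def is_deg(log2_fc: float, fdr: float,
           lfc_threshold: float = 1.0, fdr_threshold: float = 0.05) -> bool:
    """Apply the |log2 fold change| >= 1 and FDR < 0.05 criteria together."""
    return abs(log2_fc) >= lfc_threshold and fdr < fdr_threshold


# Hypothetical gene: 500 fragments on a 2,000-bp transcript, 20 million mapped fragments.
expression = fpkm(500, 2000, 20_000_000)
print(f"FPKM = {expression:.1f}")
print(is_deg(log2_fc=1.5, fdr=0.01))   # passes both criteria
print(is_deg(log2_fc=1.5, fdr=0.20))   # fails the FDR criterion
```

Dividing by transcript length and by library size makes expression values comparable between genes of different lengths and between libraries of different sequencing depths.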

To identify modules with high gene correlation, co-expression network analysis was performed using the R-based WGCNA package (v.1.69) with default parameters [ 77 ]. The varFilter function of the R language genefilter package was used to remove genes with low or stable expression levels in all samples. Modules based on the correlation between gene expression levels were identified, and a correlation matrix between each module and the sample was calculated using the R-based WGCNA software package. The module network was visualized using Cytoscape software (v.3.7.2).

RT-qPCR was performed on eight DEGs in the phenylpropanoid pathway to verify the accuracy of the data obtained from high-throughput sequencing. Total RNA was extracted using the hot borate method [72] and reverse transcribed using HiScript III All-in-one RT SuperMix (R333-01, Vazyme Biotech Co., Ltd., Nanjing, China). Subsequently, 2 × ChamQ SYBR qPCR Master Mix (Q331, Vazyme Biotech Co., Ltd., Nanjing, China) was used for quantitative detection of gene expression. The relative expression of genes was calculated using the 2^(−ΔΔCt) method [76]. GAPDH was used as an endogenous control, and primers for RT-qPCR are listed in Table S10.
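The 2^(−ΔΔCt) calculation can be sketched in a few lines of Python. The Ct values below are hypothetical, and the sketch assumes roughly 100% amplification efficiency for both the target gene and the reference gene, which is the usual assumption behind this method.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression of a target gene by the 2^(-ddCt) method."""
    delta_treated = ct_target_treated - ct_ref_treated   # dCt in the treated sample
    delta_control = ct_target_control - ct_ref_control   # dCt in the control sample
    delta_delta = delta_treated - delta_control          # ddCt
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values with GAPDH as the reference gene.
fold = relative_expression(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
print(f"Relative expression: {fold:.1f}-fold")
```

In this example the target crosses the threshold two cycles earlier in the treated sample after normalization, which corresponds to a fourfold increase.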

Dual-LUC reporter assay

A transactivation assay was designed to evaluate the effect of BHLH74 or BHLH62 on the CHS1 promoter using methods described previously [78]. Initially, a 2000-bp segment of the CHS1 promoter was cloned into the pGreenII 0800-LUC vector, generating the ProCHS1:LUC reporter plasmid. Concurrently, the coding sequence of BHLH74 or BHLH62 was inserted into the pGreenII 0029 62-SK vector, resulting in the Pro35S:BHLH74 and Pro35S:BHLH62 effector plasmids. The pGreenII 0800-LUC vector containing REN under control of the 35S promoter was used as a positive control.

Following plasmid construction, these constructs were introduced into Agrobacterium tumefaciens strain GV3101, which harbored the pSoup plasmid. Subsequently, A. tumefaciens containing different combinations of effector and reporter plasmids was infiltrated into N. benthamiana plants with six to eight young leaves. After a 3-day incubation period, the ratios of LUC to REN were quantified using the Bio-Lite Luciferase Assay System (DD1201, Vazyme Biotech Co., Ltd., Nanjing, China). Images capturing LUC signals were acquired using a CCD camera (Night Shade LB 985, Germany). Primer sequences are listed in Table S10 .
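One common way to summarize dual-luciferase readings is to compute LUC/REN per leaf and express the effector group relative to the control group. The Python sketch below illustrates this under that assumption; the luminescence values are hypothetical and the normalization shown is not confirmed as the exact procedure used here.

```python
from statistics import mean


def luc_ren_ratio(luc_signal: float, ren_signal: float) -> float:
    """Firefly luciferase signal normalized to the Renilla internal reference."""
    return luc_signal / ren_signal


def relative_to_control(sample_pairs, control_pairs) -> float:
    """Mean LUC/REN of the sample group divided by that of the control group."""
    sample = mean(luc_ren_ratio(luc, ren) for luc, ren in sample_pairs)
    control = mean(luc_ren_ratio(luc, ren) for luc, ren in control_pairs)
    return sample / control


# Hypothetical (LUC, REN) readings from three leaves per group.
control_pairs = [(1000.0, 500.0), (1200.0, 600.0), (800.0, 400.0)]
effector_pairs = [(300.0, 500.0), (360.0, 600.0), (240.0, 400.0)]

relative_activity = relative_to_control(effector_pairs, control_pairs)
print(f"Relative LUC/REN: {relative_activity:.2f}")
```

Normalizing to REN corrects for differences in infiltration efficiency between leaves, so a relative value below 1 points to repression of the promoter by the effector.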

Statistical analysis

Statistical analyses of data were conducted using IBM SPSS Statistics, while graphical representations were created using GraphPad Prism 8.0.1. Pairwise comparisons were assessed using Student's t-tests (*P < 0.05, **P < 0.01, ***P < 0.001). Each experiment was performed using a minimum of three biological replicates, and error bars on graphs denote the standard error (SE) of the mean. The Metware Cloud platform (https://cloud.metware.cn) and OmicShare tools (https://www.chiplot.online/) were used for bioinformatics analyses and mapping.
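To make the summary statistics concrete, the Python sketch below computes the mean ± SE for one group and the t statistic for a two-sample Student's t-test. It assumes an unpaired test with pooled variance, which is one reading of the description above; the data are hypothetical, and the analyses in the study were run in SPSS rather than with this code.

```python
import math
import statistics


def mean_and_se(values):
    """Return the mean and the standard error of the mean."""
    se = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), se


def student_t(a, b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff, df


# Hypothetical measurements from three biological replicates per group.
control = [1.0, 1.2, 1.1]
treated = [0.3, 0.4, 0.2]

control_mean, control_se = mean_and_se(control)
t_value, dof = student_t(control, treated)
print(f"control: {control_mean:.2f} ± {control_se:.3f} (SE)")
print(f"t = {t_value:.2f}, df = {dof}")
```

The P value follows from comparing the t statistic with the t distribution for the given degrees of freedom, which a statistics package or a published table provides.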

This work was supported by the Consult of Flower Industry of Jinning District (202204BI090022), General Project of Shenzhen Science and Technology and Innovation Commission (Grant No. 6020330006K0).

ZX, MN conceived and designed the experiments. RH and YW conducted the experiments. RH, YW, ZX analyzed the data. LY, JW, QX, CP, XT, GJ and MN performed the research. RH, SM and ZX wrote the manuscript. All authors read and approved the manuscript. RH and YW contributed equally to this work.

The datasets generated and analyzed during the current study are available in the Biological Research Project Data (BioProject), National Center for Biotechnology Information (NCBI) repository, accession: PRJNA1030783.

The authors declare that they have no competing interests.

Mileva M, Ilieva Y, Jovtchev G et al. Rose flowers—a delicate perfume or a natural healer? Biomol Ther. 2021;11:127

Katsoulas N, Kittas C, Dimokas G et al. Effect of irrigation frequency on rose flower production and quality. Biosyst Eng. 2006;93:237–44

Isah T. Stress and defense responses in plant secondary metabolites production. Biol Res. 2019;52:39

Feng D, Zhang H, Qiu X et al. Comparative transcriptomic and metabonomic analysis revealed the relationships between biosynthesis of volatiles and flavonoid metabolites in Rosa rugosa. Ornam Plant Res. 2021;1:1–10

Wang X, Zhao F, Wu Q et al. Physiological and transcriptome analyses to infer regulatory networks in flowering transition of Rosa rugosa. Ornam Plant Res. 2023;3:1–12

Jia Y, Chen C, Gong F et al. An aux/IAA family member, RhIAA14, involved in ethylene-inhibited petal expansion in rose (Rosa hybrida). Genes. 2022;13:1041

Ren H, Bai M, Sun J et al. RcMYB84 and RcMYB123 mediate jasmonate-induced defense responses against Botrytis cinerea in rose (Rosa chinensis). Plant J. 2020;103:1839–49

Chaves MM, Flexas J, Pinheiro C. Photosynthesis under drought and salt stress: regulation mechanisms from whole plant to cell. Ann Bot. 2009;103:551–60

Askari Kelestani A, Ramezanpour S, Borzouei A et al. Application of gamma rays on salinity tolerance of wheat (Triticum aestivum L.) and expression of genes related to biosynthesis of proline, glycine betaine and antioxidant enzymes. Physiol Mol Biol Plants. 2021;27:2533–47

Qi S, Wang X, Wu Q et al. Morphological, physiological and transcriptomic analyses reveal potential candidate genes responsible for salt stress in Rosa rugosa. Ornam Plant Res. 2023;3:21

Gill SS, Tuteja N. Reactive oxygen species and antioxidant machinery in abiotic stress tolerance in crop plants. Plant Physiol Biochem. 2010;48:909–30

Ye C, Zheng S, Jiang D et al. Initiation and execution of programmed cell death and regulation of reactive oxygen species in plants. Int J Mol Sci. 2021;22:12942

He L, He T, Farrar S et al. Antioxidants maintain cellular redox homeostasis by elimination of reactive oxygen species. Cell Physiol Biochem. 2017;44:532–53

Challabathula D, Analin B, Mohanan A et al. Differential modulation of photosynthesis, ROS and antioxidant enzyme activities in stress-sensitive and -tolerant rice cultivars during salinity and drought upon restriction of COX and AOX pathways of mitochondrial oxidative electron transport. J Plant Physiol. 2022;268:153583

Li C, Mur LAJ, Wang Q et al. ROS scavenging and ion homeostasis is required for the adaptation of halophyte Karelinia caspia to high salinity. Front Plant Sci. 2022;13:

Ren G, Yang P, Cui J et al. Multiomics analyses of two sorghum cultivars reveal the molecular mechanism of salt tolerance. Front Plant Sci. 2022;13:

Petrussa E, Braidot E, Zancani M et al. Plant Flavonoids--Biosynthesis, Transport and Involvement in Stress Responses. Int J Mol Sci. 2013;14:14950–73

Das S, Rosazza JPN. Microbial and enzymatic transformations of flavonoids. J Nat Prod. 2006;69:499–508

Gao Y, Liu J, Chen Y et al. Tomato SlAN11 regulates flavonoid biosynthesis and seed dormancy by interaction with bHLH proteins but not with MYB proteins. Hortic Res. 2018;5:

Zhang Z, Liu Y, Yuan Q et al. The bHLH1-DTX35/DFR module regulates pollen fertility by promoting flavonoid biosynthesis in Capsicum annuum L. Hortic Res. 2022;9:

Ramaroson M, Koutouan C, Helesbeux JJ et al. Role of Phenylpropanoids and flavonoids in plant resistance to pests and diseases. Molecules. 2022;27:8371

Schulz E, Tohge T, Winkler JB et al. Natural variation among Arabidopsis accessions in the regulation of flavonoid metabolism and stress gene expression by combined UV radiation and cold. Plant Cell Physiol. 2021;62:502–14

Wang F, Zhu H, Kong W et al. The antirrhinum AmDEL gene enhances flavonoids accumulation and salt and drought tolerance in transgenic Arabidopsis. Planta. 2016;244:59–73

Shen N, Wang T, Gan Q et al. Plant flavonoids: classification, distribution, biosynthesis, and antioxidant activity. Food Chem. 2022;383:132531

Liu W, Feng Y, Yu S et al. The flavonoid biosynthesis network in plants. Int J Mol Sci. 2021;22:12824

Zhang X, Abrahan C, Colquhoun TA et al. A proteolytic regulator controlling chalcone synthase stability and flavonoid biosynthesis in Arabidopsis. Plant Cell. 2017;29:1157–74

Riffault-Valois L, Blanchot L, Colas C et al. Molecular fingerprint comparison of closely related rose varieties based on UHPLC-HRMS analysis and chemometrics. Phytochem Anal. 2017;28:42–9

Riffault L, Destandau E, Pasquier L et al. Phytochemical analysis of Rosa hybrida cv. 'Jardin de Granville' by HPTLC, HPLC-DAD and HPLC-ESI-HRMS: polyphenolic fingerprints of six plant organs. Phytochemistry. 2014;99:127–34

Omidi M, Khandan-Mirkohi A, Kafi M et al. Biochemical and molecular responses of Rosa damascena mill. cv. Kashan to salicylic acid under salinity stress. BMC Plant Biol. 2022;22:373

Azizi S, Seyed Hajizadeh H, Aghaee A et al. In vitro assessment of physiological traits and ROS detoxification pathways involved in tolerance of damask rose genotypes under salt stress. Sci Rep. 2023;13:17795

Zhao S, Zhang Q, Liu M et al. Regulation of plant responses to salt stress. Int J Mol Sci. 2021;22:4609

Zhang C, Zhang H, Zhan Z et al. Transcriptome analysis of sucrose metabolism during bulb swelling and development in onion (Allium cepa L.). Front Plant Sci. 2016;7:1425

Kumari P, Raju DVS, Prasad KV et al. Characterization of anthocyanins and their antioxidant activities in Indian rose varieties (Rosa × hybrida) using HPLC. Antioxidants. 2022;11:2032

Akula R, Ravishankar GA. Influence of abiotic stress signals on secondary metabolites in plants. Plant Signal Behav. 2011;6:1720–31

Barhoumi Z, Djebali W, Chaïbi W et al. Salt impact on photosynthesis and leaf ultrastructure of Aeluropus littoralis. J Plant Res. 2007;120:529–37

Jiang D, Lu B, Liu L et al. Exogenous melatonin improves the salt tolerance of cotton by removing active oxygen and protecting photosynthetic organs. BMC Plant Biol. 2021;21:331

Liu D, Dong S, Miao H et al. A large-scale genomic association analysis identifies the candidate genes regulating salt tolerance in cucumber (Cucumis sativus L.) seedlings. Int J Mol Sci. 2022;23:8260

Garrido Y, Tudela JA, Marín A et al. Physiological, phytochemical and structural changes of multi-leaf lettuce caused by salt stress. J Sci Food Agric. 2014;94:1592–9

Yao X, Meng L, Zhao W et al. Changes in the morphology traits, anatomical structure of the leaves and transcriptome in Lycium barbarum L. under salt stress. Front Plant Sci. 2023;14:1090366

Tan Y, Duan Y, Chi Q et al. The role of reactive oxygen species in plant response to radiation. Int J Mol Sci. 2023;24:3346

Li M, Chen R, Jiang Q et al. GmNAC06, a NAC domain transcription factor enhances salt stress tolerance in soybean. Plant Mol Biol. 2021;105:333–45

Wan X, Peng L, Xiong J et al. AtSIBP1, a novel BTB domain-containing protein, positively regulates salt signaling in Arabidopsis thaliana. Plan Theory. 2019;8:573

Rezayian M, Niknam V, Ebrahimzadeh H. Oxidative damage and antioxidative system in algae. Toxicol Rep. 2019;6:1309–13

Liu X, Cheng X, Cao J et al. GOLDEN 2-LIKE transcription factors regulate chlorophyll biosynthesis and flavonoid accumulation in response to UV-B in tea plants. Hortic Plant J. 2023;9:1055–66

Barreca D, Gattuso G, Bellocco E et al. Flavanones: citrus phytochemical with health-promoting properties. Biofactors. 2017;43:495–506

Zhang F, Huang J, Guo H et al. OsRLCK160 contributes to flavonoid accumulation and UV-B tolerance by regulating OsbZIP48 in rice. Sci China Life Sci. 2022;65:1380–94

Cui M, Liang Z, Liu Y et al. Flavonoid profile of Anoectochilus roxburghii (wall.) Lindl. under short-term heat stress revealed by integrated metabolome, transcriptome, and biochemical analyses. Plant Physiol Biochem. 2023;201:107896

Dias MC, Pinto DCGA, Silva AMS. Plant flavonoids: chemical characteristics and biological activity. Molecules. 2021;26:5377

Kumar S, Pandey AK. Chemistry and biological activities of flavonoids: an overview. Sci World J. 2013;2013:1–16

Chen C. Sinapic acid and its derivatives as medicine in oxidative stress-induced diseases and aging. Oxidative Med Cell Longev. 2016;2016:1–10

Rasul A, Millimouno FM, Ali Eltayb W et al. Pinocembrin: a novel natural compound with versatile pharmacological and biological activities. Biomed Res Int. 2013;2013:1–9

Doneda E, Bianchi SE, Pittol V et al. 3-O-methylquercetin from Achyrocline satureioides - cytotoxic activity against A375-derived human melanoma cell lines and its incorporation into cyclodextrins-hydrogels for topical administration. Drug Deliv Transl Res. 2021;11:2151–68

Alam W, Khan H, Shah MA et al. Kaempferol as a dietary anti-inflammatory agent: current therapeutic standing. Molecules. 2020;25:4073

Chen Y, Mao Y, Liu H et al. Transcriptome analysis of differentially expressed genes relevant to variegation in peach flowers. PLoS One. 2014;9:e90842

Duan B, Tan X, Long J et al. Integrated transcriptomic-metabolomic analysis reveals that cinnamaldehyde exposure positively regulates the phenylpropanoid pathway in postharvest Satsuma mandarin (Citrus unshiu). Pestic Biochem Physiol. 2023;189:105312

Lam PY, Wang L, Lui ACW et al. Deficiency in flavonoid biosynthesis genes CHS, CHI, and CHIL alters rice flavonoid and lignin profiles. Plant Physiol. 2022;188:1993–2011

Wu X, Zhang S, Liu X et al. Chalcone synthase (CHS) family members analysis from eggplant (Solanum melongena L.) in the flavonoid biosynthetic pathway and expression patterns in response to heat stress. PLoS One. 2020;15:e0226537

Wang X, Chai X, Gao B et al. Multi-omics analysis reveals the mechanism of bHLH130 responding to low-nitrogen stress of apple rootstock. Plant Physiol. 2023;191:1305–23

Ohno S, Hosokawa M, Hoshino A et al. A bHLH transcription factor, DvIVS, is involved in regulation of anthocyanin synthesis in dahlia (Dahlia variabilis). J Exp Bot. 2011;62:5105–16

Baudry A, Caboche M, Lepiniec L. TT8 controls its own expression in a feedback regulation involving TTG1 and homologous MYB and bHLH factors, allowing a strong and cell-specific accumulation of flavonoids in Arabidopsis thaliana. Plant J. 2006;46:768–79

Gao C, Guo Y, Wang J et al. Brassica napus GLABRA3-1 promotes anthocyanin biosynthesis and trichome formation in true leaves when expressed in Arabidopsis thaliana. Plant Biol (Stuttg). 2018;20:3–9

Feyissa DN, Løvdal T, Olsen KM et al. The endogenous GL3, but not EGL3, gene is necessary for anthocyanin accumulation as induced by nitrogen depletion in Arabidopsis rosette stage leaves. Planta. 2009;230:747–54

Lim S, Kim D, Jung J et al. Alternative splicing of the basic helix-loop-helix transcription factor gene CmbHLH2 affects anthocyanin biosynthesis in ray florets of chrysanthemum (Chrysanthemum morifolium). Front Plant Sci. 2021;12:

Song Y, Ma B, Guo Q et al. UV-B induces the expression of flavonoid biosynthetic pathways in blueberry (Vaccinium corymbosum) calli. Front Plant Sci. 2022;13:

Li W, Mao J, Yang SJ et al. Anthocyanin accumulation correlates with hormones in the fruit skin of 'Red Delicious' and its four generation bud sport mutants. BMC Plant Biol. 2018;18:363

Li W, Ning GX, Mao J et al. Whole-genome DNA methylation patterns and complex associations with gene expression associated with anthocyanin biosynthesis in apple fruit skin. Planta. 2019;250:1833–47

Sun J, Lu J, Bai M et al. Phytochrome-interacting factors interact with transcription factor CONSTANS to suppress flowering in rose. Plant Physiol. 2021;186:1186–201

Su L, Zhang Y, Yu S et al. RcbHLH59-RcPRs module enhances salinity stress tolerance by balancing Na+/K+ through callose deposition in rose (Rosa chinensis). Hortic Res. 2023;10:

Liu W, Zhang R, Xiang C et al. Transcriptomic and physiological analysis reveal that α-linolenic acid biosynthesis responds to early chilling tolerance in pumpkin rootstock varieties. Front Plant Sci. 2021;12:

Bradford MM. A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem. 1976;72:248–54

Cheng C, Yu Q, Wang Y et al. Ethylene-regulated asymmetric growth of the petal base promotes flower opening in rose (Rosa hybrida). Plant Cell. 2021;33:1229–51

Zhang Y, Wu Z, Feng M et al. The circadian-controlled PIF8-BBX28 module regulates petal senescence in rose flowers by governing mitochondrial ROS homeostasis at night. Plant Cell. 2021;33:2716–35

Meng Y, Zhang H, Fan Y et al. Anthocyanins accumulation analysis of correlated genes by metabolome and transcriptome in green and purple peppers (Capsicum annuum). BMC Plant Biol. 2022;22:358

Deng H, Wu G, Zhang R et al. Comparative nutritional and metabolic analysis reveals the taste variations during yellow rambutan fruit maturation. Food Chem X. 2023;17:100580

Liu D, Pan Y, Li K et al. Proteomics reveals the mechanism underlying the inhibition of Phytophthora sojae by propyl gallate. J Agric Food Chem. 2020;68:8151–62

Yang B, He S, Liu Y et al. Transcriptomics integrated with metabolomics reveals the effect of regulated deficit irrigation on anthocyanin biosynthesis in cabernet sauvignon grape berries. Food Chem. 2020;314:126170

Umer MJ, Bin Safdar L, Gebremeskel H et al. Identification of key gene networks controlling organic acid and sugar metabolism during watermelon fruit development by integrating metabolic phenotypes and gene expression profiles. Hortic Res. 2020;7:193

Liang Y, Jiang C, Liu Y et al. Auxin regulates sucrose transport to repress petal abscission in rose (Rosa hybrida). Plant Cell. 2020;32:3485–99


  • Online ISSN 2052-7276
  • Print ISSN 2662-6810
  • Copyright © 2024 Nanjing Agricultural University

  • Copyright © 2024 Oxford University Press

What do you think caused your ALS? An analysis of the CDC national amyotrophic lateral sclerosis patient registry qualitative risk factor data using artificial intelligence and qualitative methodology


Research on internal quality testing method of dry longan based on terahertz imaging detection technology

  • Original Paper
  • Published: 11 May 2024


  • Jun Hu   ORCID: orcid.org/0000-0003-0027-7993 1 ,
  • Hao Wang 1 ,
  • Yongqi Zhou 1 ,
  • Shimin Yang 1 ,
  • Haohao Lv 1 &
  • Liang Yang 1  


Longan is a fruit with rich nutritional value that is used as both food and medicine. Its quality directly affects its curative effect, and fullness is the key index for evaluating that quality. Because the internal condition of a longan cannot be judged from the outside, rapid non-destructive testing of the internal quality of dry longan is of great significance. This paper reports rapid non-destructive testing of longan internal fullness based on terahertz transmission imaging. First, terahertz transmission images of longans with different fullness were collected, and the terahertz spectral signals of different regions of interest were extracted for analysis. Then, three qualitative discriminant models, support vector machine (SVM), random forest (RF) and linear discriminant analysis (LDA), were established to find the optimal model for distinguishing the regional categories of longan. Finally, the collected terahertz transmission images were processed with Otsu threshold segmentation and image inversion, and the number of white pixels in each connected domain was counted. Fullness was obtained by calculating the ratio of core and pulp pixels to shell pixels. The LDA discriminant model gave the best prediction: it identified the spectral data of the background, shell and core regions and reached 98.57% accuracy on the spectral data of the pulp region. The maximum error between the fullness measured from the Otsu-segmented terahertz images and the actual fullness was less than 3.11%. Terahertz imaging can therefore realize rapid non-destructive detection of longan fullness and recognition of its different regions, and this study provides an effective scheme for screening longan by quality.
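The image-processing step in this abstract (Otsu thresholding, inversion, then a pixel-count ratio) can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the authors' pipeline: the image is a flat list of 8-bit grey values, connected-domain labelling is skipped, and the toy pixel values and the shell-region pixel count are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level that maximises between-class variance (Otsu's method)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # summed intensity of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

def fullness(inner_pixels, shell_pixels):
    """Ratio of core-and-pulp pixels to shell-region pixels."""
    return inner_pixels / shell_pixels

# Toy transmission image: bright background (250) and dark flesh (30).
image = [250] * 40 + [30] * 60
t = otsu_threshold(image)

# Inversion: low-transmission (dark) pixels become the white foreground.
white = sum(1 for p in image if p <= t)

shell_region = 80  # hypothetical pixel count of the region enclosed by the shell
fullness_ratio = fullness(white, shell_region)
print(f"threshold = {t}, white pixels = {white}, fullness = {fullness_ratio:.2f}")
```

In practice a library implementation such as scikit-image's `threshold_otsu` on a 2-D array, followed by connected-component labelling, would replace the hand-written loop.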


Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.



National Youth Natural Science Foundation of China (32302261); Jiangxi Ganpo Talented Support Plan -Young science and technology talent Lift Project (2023QT04); Jiangxi Provincial Youth Science Fund Project (20224BAB215042); National Key R&D Program of China (2022YFD2001805).

Author information

Authors and affiliations.

School of Mechatronics & Vehicle Engineering, East China Jiaotong University, Nanchang, 330013, Jiangxi, China

Jun Hu, Hao Wang, Yongqi Zhou, Shimin Yang, Haohao Lv & Liang Yang



Jun Hu: Investigation, Writing-review and editing, Experimental scheme design, Formal analysis. Hao Wang: Writing-original draft, Formal analysis. Yongqi Zhou: Experiment. Shimin Yang, Haohao Lv: Review and editing. Liang Yang: Formal analysis.

Corresponding author

Correspondence to Jun Hu.

Ethics declarations

Competing interests.

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, this manuscript. Jun Hu, Hao Wang, Yongqi Zhou, Shimin Yang, Haohao Lv and Liang Yang declare that they have no conflict of interest.


About this article

Hu, J., Wang, H., Zhou, Y. et al. Research on internal quality testing method of dry longan based on terahertz imaging detection technology. Food Measure (2024). https://doi.org/10.1007/s11694-024-02583-x

Received: 29 December 2023

Accepted: 18 April 2024

Published: 11 May 2024

DOI: https://doi.org/10.1007/s11694-024-02583-x


  • Longan fullness
  • Terahertz imaging technology
  • Otsu threshold segmentation
  • Linear discriminant analysis (LDA)


  1. Data Analysis in Research: Types & Methods

    Definition of research in data analysis: According to LeCompte and Schensul, research data analysis is a process used by researchers to reduce data to a story and interpret it to derive insights. The data analysis process helps reduce a large chunk of data into smaller fragments, which makes sense. Three essential things occur during the data ...

  2. What Is a Research Methodology?

    Your research methodology discusses and explains the data collection and analysis methods you used in your research. A key part of your thesis, dissertation, or research paper, the methodology chapter explains what you did and how you did it, allowing readers to evaluate the reliability and validity of your research and your dissertation topic.

  3. Research Methods

    Qualitative analysis tends to be quite flexible and relies on the researcher's judgement, so you have to reflect carefully on your choices and assumptions and be careful to avoid research bias. Quantitative analysis methods. Quantitative analysis uses numbers and statistics to understand frequencies, averages and correlations (in descriptive ...

  4. Learning to Do Qualitative Data Analysis: A Starting Point

    For many researchers unfamiliar with qualitative research, determining how to conduct qualitative analyses is often quite challenging. Part of this challenge is due to the seemingly limitless approaches that a qualitative researcher might leverage, as well as simply learning to think like a qualitative researcher when analyzing data. From framework analysis (Ritchie & Spencer, 1994) to content ...

  5. A tutorial on methodological studies: the what, when, how and why

    Methodological studies - studies that evaluate the design, analysis or reporting of other research-related reports - play an important role in health research. They help to highlight issues in the conduct of research with the aim of improving health research methodology, and ultimately reducing research waste.

  6. Data Analysis

    Data Analysis. Definition: Data analysis refers to the process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, drawing conclusions, and supporting decision-making. It involves applying various statistical and computational techniques to interpret and derive insights from large datasets.

  7. A Step-by-Step Process of Thematic Analysis to Develop a Conceptual

    Thematic analysis is a research method used to identify and interpret patterns or themes in a data set; it often leads to new insights and understanding (Boyatzis, 1998; Elliott, 2018; Thomas, 2006). However, it is critical that researchers avoid letting their own preconceptions interfere with the identification of key themes (Morse & Mitcham, 2002; Patton, 2015).

  8. Data analysis

    Data analysis, the process of systematically collecting, cleaning, transforming, describing, modeling, and interpreting data, generally employing statistical techniques. Data analysis is an important part of both scientific research and business, where demand has grown in recent years for data-driven decision making.

  9. The Beginner's Guide to Statistical Analysis

    Statistical analysis means investigating trends, patterns, and relationships using quantitative data. It is an important research tool used by scientists, governments, businesses, and other organizations. ... Your choice of statistical test depends on your research questions, research design, sampling method, and data characteristics ...

  10. A Comprehensive Guide to Methodology in Research

    Research methodology refers to the system of procedures, techniques, and tools used to carry out a research study. It encompasses the overall approach, including the research design, data collection methods, data analysis techniques, and the interpretation of findings. Research methodology plays a crucial role in the field of research, as it ...

  11. A tutorial on methodological studies: the what, when, how and why

    Background: Methodological studies - studies that evaluate the design, analysis or reporting of other research-related reports - play an important role in health research. They help to highlight issues in the conduct of research with the aim of improving health research methodology, and ultimately reducing research waste. Main body: We provide an overview of some of the key aspects of ...

  12. Research Methodology

    Qualitative Research Methodology. This is a research methodology that involves the collection and analysis of non-numerical data such as words, images, and observations. This type of research is often used to explore complex phenomena, to gain an in-depth understanding of a particular topic, and to generate hypotheses.

  13. What Is Research Methodology? Definition + Examples

    Learn what research methodology means, how to choose between qualitative, quantitative and mixed-methods, and how to select data collection and analysis methods. Find out the difference between research design and methodology, and see examples of research methodologies with videos and tips.

  14. Units of Analysis and Methodologies for Qualitative Studies

    By Janet Salmons, PhD, Manager, Sage Research Methods Community. Selecting the methodology is an essential piece of research design. This post is excerpted and adapted from Chapter 2 of Doing Qualitative Research Online (2022). Use the code COMMUNITY3 for a 20% discount on the book ...

  15. How to conduct a meta-analysis in eight steps: a practical guide

    2.1 Step 1: defining the research question. The first step in conducting a meta-analysis, as with any other empirical study, is the definition of the research question. Most importantly, the research question determines the realm of constructs to be considered or the type of interventions whose effects shall be analyzed.

  16. How to use and assess qualitative research methods

    How to conduct qualitative research? Given that qualitative research is characterised by flexibility, openness and responsivity to context, the steps of data collection and analysis are not as separate and consecutive as they tend to be in quantitative research [13, 14]. As Fossey puts it: "sampling, data collection, analysis and interpretation are related to each other in a cyclical ...

  17. What is data analysis? Methods, techniques, types & how-to

    Data mining is a method of data analysis and the umbrella term for engineering metrics and insights for additional value, direction, and context. By using exploratory statistical evaluation, data mining aims to identify dependencies, relations, patterns, and trends to generate advanced knowledge.

  18. Basic statistical tools in research and data analysis

    Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretations and reporting the research findings. Statistical analysis gives meaning to otherwise meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise only if ...

  19. Regression Analysis

    Regression analysis is a quantitative research method which is used when the study involves modelling and analysing several variables, where the relationship includes a dependent variable and one or more independent variables. In simple terms, regression analysis is a quantitative method used to test the nature of relationships between a dependent variable and one or more independent variables.
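The relationship between a dependent variable and a single independent variable can be sketched with ordinary least squares. This is a minimal illustration only: the function name `fit_simple_linear_regression` and the study-hours data are hypothetical and do not come from the article above.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of the independent variable.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    # Sum of cross-products of deviations of x and y.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: hours studied (independent) and test score (dependent).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
slope, intercept = fit_simple_linear_regression(hours, scores)
print(f"score ~ {intercept:.1f} + {slope:.1f} * hours")
```

The slope estimates how much the dependent variable changes per unit of the independent variable. With several independent variables, a statistics package would normally be used instead of hand-written sums.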

  20. Content Analysis

    Content analysis is a research method used to identify patterns in recorded communication. To conduct content analysis, you systematically collect data from a set of texts, which can be written, oral, or visual: Books, newspapers and magazines. Speeches and interviews. Web content and social media posts. Photographs and films.
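One simple way to find patterns in a set of texts is to count how many texts mention each category in a keyword codebook. The sketch below assumes a hypothetical codebook (`CODEBOOK`) and made-up survey responses; real content analysis usually relies on a codebook developed and validated by the researchers.

```python
import re
from collections import Counter

# Hypothetical codebook: each category maps to keywords that signal it.
CODEBOOK = {
    "cost": {"price", "expensive", "cheap", "cost"},
    "service": {"staff", "support", "service"},
}


def code_texts(texts, codebook):
    """Count how many texts mention at least one keyword per category."""
    counts = Counter()
    for text in texts:
        # Lowercase and split into a set of word tokens.
        tokens = set(re.findall(r"[a-z']+", text.lower()))
        for category, keywords in codebook.items():
            if tokens & keywords:
                counts[category] += 1
    return counts


# Made-up open-ended responses used only for illustration.
responses = [
    "The price was too expensive for what we got.",
    "Support staff answered quickly.",
    "Great service, but the cost keeps rising.",
]
category_counts = code_texts(responses, CODEBOOK)
print(dict(category_counts))
```

Counting each text at most once per category keeps long responses from dominating the tallies; counting every keyword occurrence is an equally valid choice when intensity matters.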

  21. New Content From Advances in Methods and Practices in Psychological

    Participants included 103 research-methods instructors, academics, students, and nonacademic psychologists. Of 78 items included in the consensus process, 34 reached consensus. We coupled these results with a qualitative analysis of 707 open-ended text responses to develop nine recommendations for organizations that accredit undergraduate ...

  22. Principal stratification methods and software for intercurrent events

    This project aims to develop a suite of transparent and accessible analysis tools, software and educational material for applying the principal stratification method to analyze intercurrent events ...

  23. Determining Methodology Through Research Questions

    Selecting the right methodology is a direct response to your research question. If your question is broad and exploratory, qualitative methods like interviews or focus groups might be the best fit.

  24. Methodology

    The analysis in this report is based on a self-administered web survey conducted from Sept. 26 to Oct. 23, 2023, among a sample of 1,453 dyads, with each ... sampling methodology. In 2009, Ipsos migrated to an address-based sampling (ABS) recruitment methodology via the U.S. Postal Service's Delivery Sequence File (DSF). ... demographic ...

  25. Multi-omics analysis reveals key regulatory defense pathways and genes

    This research facilitates our understanding of the regulatory mechanisms of plant development and secondary metabolites underlying salt stress responses in rose, offering valuable insights that could be used to develop new strategies for improving plant tolerance to salinity. Materials and methods Plant materials and growth conditions

  26. Introduction to systematic review and meta-analysis

    It is easy to confuse systematic reviews and meta-analyses. A systematic review is an objective, reproducible method to find answers to a certain research question, by collecting all available studies related to that question and reviewing and analyzing their results. A meta-analysis differs from a systematic review in that it uses statistical ...

  27. What do you think caused your ALS? An analysis of the CDC national


  28. How to Do Thematic Analysis

    Like all academic texts, writing up a thematic analysis requires an introduction to establish our research question, aims and approach. We should also include a methodology section, describing how we collected the data (e.g. through semi-structured interviews or open-ended survey questions ) and explaining how we conducted the thematic analysis ...

  29. Research on internal quality testing method of dry longan ...

    Linear Discriminant Analysis (LDA) [26, 27] is a supervised dimensionality reduction method, and the classification output of the LDA algorithm can be used for qualitative analysis of the data. The principle of the LDA algorithm is to project the original data into a lower-dimensional space, and the projected points will ...
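The projection idea can be sketched for the simplest case: two classes of 2-D points projected onto one dimension using Fisher's discriminant direction. This is an assumption-laden illustration, not the cited study's implementation; the helper names and the sample points are hypothetical.

```python
def mean_vector(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter_matrix(points, center):
    """2x2 scatter matrix: sum of outer products of deviations."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - center[0], p[1] - center[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class_a, class_b):
    """Direction w = Sw^-1 (mean_b - mean_a) for two classes in 2-D."""
    ma, mb = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter_matrix(class_a, ma), scatter_matrix(class_b, mb)
    # Within-class scatter is the sum of the per-class scatter matrices.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def project(point, w):
    """Project a 2-D point onto direction w (a single number)."""
    return point[0] * w[0] + point[1] * w[1]


# Hypothetical labelled samples for two classes.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.5)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.5), (7.5, 7.0)]
w = fisher_direction(class_a, class_b)
proj_a = [project(p, w) for p in class_a]
proj_b = [project(p, w) for p in class_b]
print(max(proj_a) < min(proj_b))
```

For this sample the projected values of the two classes do not overlap, so a single threshold on the 1-D projection separates them. With more classes or features, a linear algebra library would replace the hand-written 2x2 inverse.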

  30. Rubber Market Size & Trends

    The global rubber market size was valued at USD 46.95 billion in 2023 and is expected to expand at a compound annual growth rate (CAGR) of 5.08% from 2024 to 2030. The market has been significantly driven by the superior properties of the product, such as abrasion and heat resistance, and by its applicability as a valuable raw material in various end-use industries ...