MEC-109: Research Methods in Economics - Important Concepts for Exams

    Focus on these key topics that are frequently covered in exams and assignments for MAEC.


    Most Important Topics
    High Priority
    Core principles that are asked in exams almost every year

    Key Topics:

• Critical theory paradigm and its difference from the interpretive paradigm

• Regression models: goodness of fit and assumptions of the classical linear regression model

• Gini index, Atkinson index, Sen index

    • Composite index

• Difference between the inductive and hypothetico-deductive (HD) models

• Science as a knowledge-seeking activity that is objective, rational, and progressive

• Structural equation modelling (SEM)

    • Cluster analysis

    • Factor analysis

• How research perspectives help in the social sciences

    Topics in Detail

    Composite Index

    1. Meaning of Composite Index

    A Composite Index is a single summary measure constructed by combining multiple individual indicators to represent a multidimensional concept that cannot be captured by a single variable.

    Examples

    • Human Development Index (HDI)

    • Multidimensional Poverty Index (MPI)

    • Consumer Price Index (CPI)

    Composite indices are widely used in economic, social, and development research for comparison across regions or over time.

    2. Need for Composite Index

    • Captures multidimensional phenomena

    • Simplifies complex information

    • Enables ranking and comparison

    • Useful for policy formulation and evaluation

    Steps in the Construction of a Composite Index

    The construction of a composite index involves systematic and transparent steps to ensure reliability and validity.

    Step 1: Conceptual Framework and Objective Definition

    The first step is to clearly define:

    • What is being measured

    • Why the index is required

    • The dimensions involved

    Example:
    For HDI, the concept of human development includes health, education, and income.

    Step 2: Selection of Indicators

    Relevant indicators are selected for each dimension. Indicators should be:

    • Relevant

    • Measurable

    • Reliable

    • Comparable across units

    Example (HDI):

    • Health → Life expectancy at birth

    • Education → Mean years of schooling, Expected years of schooling

    • Income → GNI per capita

    Step 3: Data Collection

    Data are collected from:

    • Census

    • Surveys

    • Administrative records

    • National and international databases

    The data should be consistent across time and space.

    Step 4: Treatment of Missing Values

    Missing data must be handled carefully using methods such as:

    • Mean substitution

    • Interpolation

    • Deletion of indicators (if unavoidable)

    Improper handling can distort the index.

    Step 5: Normalization (Standardization) of Data

    Since indicators are measured in different units, they must be converted to a common scale.

    Common normalization methods:

    • Min–Max normalization

    • Z-score standardization

    Min–Max formula:

$$X^* = \frac{X - X_{min}}{X_{max} - X_{min}}$$

    Step 6: Direction of Indicators

    Indicators may be:

    • Positive (higher value = better outcome, e.g., income)

    • Negative (higher value = worse outcome, e.g., infant mortality)

    Negative indicators are transformed so that higher values always indicate better performance.

    Step 7: Weight Assignment

    Weights reflect the relative importance of indicators.

    Methods of weighting:

    • Equal weighting

    • Expert judgment

    • Statistical methods (e.g., PCA)

    Example:
    HDI assigns equal weight to its three dimensions.

    Step 8: Aggregation of Indicators

    Normalized and weighted indicators are combined to form the composite index.

    Methods:

    • Arithmetic mean

    • Geometric mean

    HDI uses the geometric mean to reduce substitutability among dimensions.
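A minimal Python sketch of Steps 5, 7, and 8, using hypothetical indicator values for four regions and the sample minimum and maximum as goalposts (the actual HDI uses fixed goalposts):

```python
import numpy as np

def min_max_normalize(x):
    """Step 5: rescale an indicator to the 0-1 range using sample min/max."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# Hypothetical indicator values for four regions (higher = better)
health    = np.array([62.0, 70.5, 68.0, 75.2])   # life expectancy (years)
education = np.array([4.8, 7.1, 6.2, 9.0])       # mean years of schooling
income    = np.array([1200, 2500, 1800, 4100])   # GNI per capita

# Step 5: normalize; Step 7: equal weights; Step 8: aggregate
normalized = np.vstack([min_max_normalize(v) for v in (health, education, income)])
weights = np.array([1/3, 1/3, 1/3])

arithmetic_index = weights @ normalized                              # arithmetic mean
geometric_index  = np.prod(normalized ** weights[:, None], axis=0)   # geometric mean (HDI-style)
# Note: with sample min/max goalposts the worst region scores 0 on each indicator,
# which drives its geometric-mean index to 0; fixed goalposts avoid this.

print("Arithmetic:", arithmetic_index.round(3))
print("Geometric :", geometric_index.round(3))
```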

    Step 9: Sensitivity and Robustness Analysis

    This step checks:

    • Sensitivity to choice of indicators

    • Sensitivity to weights and aggregation method

    Ensures reliability of the index.

    Step 10: Interpretation and Validation

    The index values are interpreted, compared, and validated using:

    • Rankings

    • Time-series comparison

    • Cross-country or inter-regional analysis

    Advantages of Composite Index

    • Captures multidimensionality

    • Simplifies complex data

    • Facilitates comparison

    • Useful for policy formulation

    Limitations of Composite Index

    • Subjectivity in indicator and weight selection

    • Data availability constraints

    • Risk of oversimplification

    • Sensitivity to methodology

    3. Dealing with Missing Values in Composite Index Construction

    Missing values can distort index values and rankings. Hence, careful treatment is required.

    (a) Deletion Methods

    1. Listwise Deletion

    • Remove observations with missing values

    • Simple but reduces sample size

    • Suitable only when missing data are minimal and random

    2. Pairwise Deletion

    • Uses all available data for each indicator

    • May create inconsistencies across components

    (b) Imputation Methods

    1. Mean/Median Imputation

    • Replace missing values with mean or median

    • Easy to implement

    • Reduces variability and may bias results

    2. Regression Imputation

    • Predict missing values using other variables

    • More accurate but model-dependent

    3. Multiple Imputation

    • Generates several plausible values

    • Accounts for uncertainty

    • Considered statistically robust

    (c) Indicator Adjustment Methods

    1. Re-weighting

    • Adjust weights of remaining indicators

    • Prevents penalising units with missing data

    2. Normalisation-Based Substitution

    • Use regional or group averages after standardisation

    (d) Threshold-Based Exclusion

    • Exclude indicators or units if missing data exceed a defined limit

    • Maintains reliability of the index
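A minimal pandas sketch of two of the simpler treatments listed above (mean imputation and interpolation), using a hypothetical indicator table:

```python
import pandas as pd

# Hypothetical indicator table with missing values (NaN)
df = pd.DataFrame({
    "literacy_rate":    [72.0, None, 85.5, 64.0],
    "infant_mortality": [38.0, 25.0, None, 47.0],
})

# Mean imputation: replace each missing value with the column mean
mean_imputed = df.fillna(df.mean(numeric_only=True))

# Interpolation: fill gaps from neighbouring observations
interpolated = df.interpolate(limit_direction="both")

print(mean_imputed)
print(interpolated)
```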

    4. Best Practices

    • Analyse the pattern of missingness

    • Use transparent and consistent methods

    • Conduct sensitivity analysis

    • Clearly report assumptions and methods

    Sources of Financial Data for Money and Capital Markets

    Researchers rely on primary and secondary sources of financial data to identify research problems, formulate hypotheses, and conduct empirical analysis.

    1. Sources of Financial Data for the Money Market

    The money market deals with short-term funds and highly liquid instruments such as treasury bills, call money, commercial paper, and certificates of deposit.

    (a) Central Bank Publications

    The most important source of money market data is the Reserve Bank of India (RBI).

    Key RBI publications include:

    • Handbook of Statistics on the Indian Economy

    • RBI Bulletin

    • Annual Report of RBI

    • Database on Indian Economy (DBIE)

    These provide data on:

    • Call money rates

    • Treasury bill yields

    • Repo and reverse repo rates

    • Money supply (M1, M2, M3)

    • Liquidity conditions

    (b) Government Sources

    • Ministry of Finance

    • Controller General of Accounts
      Data includes:

    • Short-term government borrowings

    • Treasury bill auctions

    • Fiscal deficit financing

    (c) Financial Institutions and Banks

    • Commercial banks

    • Primary dealers
      They publish data on:

    • Interbank lending

    • Deposit and lending rates

    • Credit growth

    (d) International Sources

    • International Monetary Fund (IMF)

    • World Bank

    These provide comparable cross-country money market indicators such as:

    • Interest rates

    • Inflation

    • Monetary aggregates

    2. Sources of Financial Data for the Capital Market

    The capital market deals with long-term financial instruments such as shares, debentures, and bonds.

    (a) Stock Exchanges

    Major stock exchanges are:

    • Bombay Stock Exchange (BSE)

    • National Stock Exchange (NSE)

    They provide:

    • Share prices and indices

    • Trading volume

    • Market capitalisation

    • Volatility measures

    (b) Market Regulator

    The Securities and Exchange Board of India (SEBI) publishes:

    • Market surveillance reports

    • Investor participation data

    • Mutual fund statistics

    • Corporate governance data

    (c) Corporate Financial Statements

    • Annual reports

    • Balance sheets

    • Profit and loss accounts
      These help analyse:

    • Firm performance

    • Capital structure

    • Dividend policy

    (d) Financial Databases and Research Institutions

    • CMIE (Centre for Monitoring Indian Economy)

    • Stock market databases (firm-level and sectoral data)

    (e) International Capital Market Data

    • World Federation of Exchanges

    • IMF’s Global Financial Stability Reports
      Used for comparative and global market analysis.

    3. Role of Financial Data in Identifying Research Issues

Financial data helps researchers to identify and refine research problems in the following ways:

(a) Identifying Emerging Issues

• Rising interest rate volatility may indicate liquidity stress.

• Stock market booms or crashes suggest speculative bubbles or structural changes.

    (b) Testing Economic Theories

    • Interest rate data helps test monetary transmission mechanisms.

    • Stock price behaviour helps test the Efficient Market Hypothesis.

    (c) Identifying Market Imperfections

    • Credit rationing

    • Information asymmetry

    • Excess volatility
      These issues become visible through empirical financial data.

    (d) Policy Evaluation

    • Impact of monetary policy changes

    • Effects of financial reforms and deregulation

    • Performance of capital market regulations

4. Role of Financial Data in Analysis

    Financial data enables quantitative and econometric analysis, such as:

    (a) Econometric Modelling

    • Regression analysis

    • Time-series analysis

    • Volatility models (ARCH/GARCH)

    (b) Risk and Return Analysis

    • Portfolio analysis

    • Asset pricing models (CAPM)

    • Credit risk assessment

    (c) Forecasting

    • Interest rate forecasting

    • Stock market trend prediction

    • Business cycle analysis

    (d) Cross-Country Comparisons

    • Financial depth

    • Market integration

    • Capital flows and stability

Application of Multistage Sampling to Study Poverty Incidence in a Village

    Meaning of Multistage Sampling

    Multistage sampling is a probability sampling technique in which the sample is selected in successive stages, using smaller and smaller sampling units at each stage. Instead of selecting final units (households) directly, selection is done step-by-step.

    This method is especially useful when:

    • The population is large and scattered

    • A complete list of households is not readily available

    • Time and cost constraints exist

    Applying Multistage Sampling to Measure Poverty in a Village

    Suppose a researcher wants to estimate the incidence of poverty among households in a village or group of villages.

    Stage 1: Selection of Region / District

    • From a state, a district is selected randomly or purposively (depending on research design).

    • Example: Select one backward district for poverty analysis.

    Stage 2: Selection of Villages

    • From the selected district, a sample of villages is drawn using simple random sampling or probability proportional to size (PPS).

    • Larger villages may have higher probability of selection.

    Stage 3: Selection of Households

    • From each selected village, a list of households is prepared.

    • Households are then selected randomly or systematically.

    Stage 4: Data Collection

    • Information is collected on:

      • Income or consumption expenditure

      • Employment

      • Household size

    • Poverty is assessed using a poverty line (income or consumption-based).
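A minimal Python sketch of Stages 2 and 3, using hypothetical village names, household counts, and sample sizes:

```python
import random

random.seed(42)

# Hypothetical sampling frame: villages in the selected district and their household lists
district = {
    "Village_A": [f"A_hh_{i}" for i in range(1, 201)],
    "Village_B": [f"B_hh_{i}" for i in range(1, 121)],
    "Village_C": [f"C_hh_{i}" for i in range(1, 301)],
    "Village_D": [f"D_hh_{i}" for i in range(1, 81)],
}

# Stage 2: select villages with probability proportional to size (PPS, with replacement for simplicity)
villages = list(district)
sizes = [len(district[v]) for v in villages]
sampled_villages = random.choices(villages, weights=sizes, k=2)

# Stage 3: within each sampled village, draw households by simple random sampling
sample = {v: random.sample(district[v], k=20) for v in set(sampled_villages)}

for village, households in sample.items():
    print(village, households[:3], "...")
```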

    Advantages of Multistage Sampling in Poverty Studies

    • Cost-effective and feasible

    • Suitable for rural and scattered populations

    • Flexible at different stages

    • Reduces fieldwork burden

    Limitations

    • Sampling error may accumulate at each stage

    • More complex than single-stage sampling

    • Requires careful design and execution

    Comparison of Operational Procedure: Stratified Sampling vs Multistage Sampling

    Meaning of Stratified Sampling

    In stratified sampling, the population is first divided into homogeneous sub-groups (strata) based on specific characteristics (e.g., income groups, caste, gender), and then samples are drawn from each stratum.

    Comparison Table

Basis of Comparison | Stratified Sampling | Multistage Sampling
--- | --- | ---
Basic Principle | Population divided into strata | Sampling done in stages
Nature of Groups | Homogeneous within strata | Heterogeneous units at each stage
Sampling Stages | Single stage after stratification | Two or more stages
Sampling Frame | Required for entire population | Required only for each stage
Selection of Units | Direct selection of final units | Indirect, step-by-step selection
Cost and Time | More costly for large populations | Relatively economical
Precision | High precision | Slightly lower precision
Usefulness | Small, well-defined populations | Large, geographically dispersed populations
Example | Sampling poor and non-poor households separately | District → Village → Household

    Operational Differences Explained

    • Stratified Sampling ensures representation of all important sub-groups and improves accuracy, but requires complete prior information.

    • Multistage Sampling is operationally simpler for large-scale field surveys but may involve higher sampling error.

    Research Methods and Research Methodology

    Distinction between Research Methods and Research Methodology

    (a) Meaning of Research Methods

    Research methods refer to the specific techniques, tools, and procedures used for collecting and analysing data in a research study. They answer the question “How is the research carried out?”

    Examples:

    • Surveys

    • Interviews

    • Observation

    • Statistical analysis

    • Regression techniques

    (b) Meaning of Research Methodology

    Research methodology refers to the overall philosophical framework and logic that guides the selection and use of research methods. It explains why certain methods are used and how the research is systematically designed.

    It includes:

    • Research philosophy

    • Assumptions about reality (ontology)

    • Nature of knowledge (epistemology)

    • Research strategy and design

    (c) Differences between Research Methods and Research Methodology

Basis | Research Methods | Research Methodology
--- | --- | ---
Meaning | Techniques of data collection and analysis | Philosophical framework guiding research
Scope | Narrow | Broad
Focus | Practical and operational | Theoretical and conceptual
Concerned with | Data, tools, procedures | Logic, assumptions, justification
Level | Micro-level | Macro-level
Example | Interview, questionnaire | Positivism, interpretivism

    (d) Relationship between the Two

    Research methodology determines which research methods are appropriate. Thus, methods are embedded within a methodology and cannot be chosen independently of it.

    Formulation of a Research Proposal

    Step 1: Identification of the Research Problem

    • Selection of a clear, specific, and researchable problem

    • Must be relevant to theory, policy, or practice

    • Avoids vague or overly broad topics

    Example:
    “Determinants of rural female labour force participation in India”

    Step 2: Review of Literature

    • Study of existing research, theories, and findings

    • Helps to:

      • Identify research gaps

      • Avoid duplication

      • Refine research questions

    Step 3: Statement of Research Objectives and Questions

    • Clearly states what the study aims to achieve

    • Objectives should be:

      • Specific

      • Measurable

      • Achievable

    Example

    • To analyse the impact of education on labour participation

    • To examine regional variations

    Step 4: Formulation of Hypotheses (if applicable)

    • Tentative statements to be tested empirically

    • Common in quantitative research

    Example

    H₁: Education has a positive impact on female labour force participation.

    Step 5: Research Design

    • Overall blueprint of the study

    • Includes:

      • Type of study (exploratory, descriptive, causal)

      • Time dimension (cross-sectional or longitudinal)

    Step 6: Data Sources and Data Collection Methods

    • Primary data: surveys, interviews, observations

    • Secondary data: Census, NSS, NFHS, RBI data

    Justification of chosen method is essential.

    Step 7: Sampling Design

    • Definition of:

      • Target population

      • Sample size

      • Sampling technique (random, stratified, purposive)

    Step 8: Tools and Techniques of Analysis

    • Statistical and econometric tools to be used

    • Examples:

      • Descriptive statistics

      • Regression models

      • Index numbers

    Step 9: Scope and Limitations of the Study

    • Defines boundaries of the research

    • Acknowledges constraints such as:

      • Time

      • Data availability

      • Methodological limitations

    Step 10: Ethical Considerations

    • Confidentiality of data

    • Informed consent

    • Avoidance of plagiarism and bias

    Step 11: Expected Contribution of the Study

    • Academic contribution

    • Policy relevance

    • Practical implications

    Step 12: Time Schedule and Budget (if required)

    • Work plan with timelines

    • Financial requirements for data collection and analysis

    Methodologies Used in Interpretive Research

    Interpretive research employs several methodologies aimed at understanding subjective meanings and social processes.

    (a) Phenomenological Methodology

    • Focuses on individuals’ lived experiences

    • Seeks to understand how people perceive and interpret economic realities

    • Used in studies of poverty, unemployment, and informal labour

    Example:
    Understanding how households experience poverty rather than merely measuring income levels.

    (b) Ethnographic Methodology

    • Involves long-term immersion in a social setting

    • Uses participant observation and informal interviews

    • Common in rural development and informal sector studies

    Example:
    Studying work culture and survival strategies of street vendors.

    (c) Hermeneutic Methodology

    • Concerned with interpretation of texts and narratives

    • Includes policy documents, interviews, and historical records

    • Emphasises context and historical background

    Example:
    Interpreting development policy documents to understand underlying assumptions about growth and welfare.

    (d) Case Study Methodology

    • In-depth study of a single case or small number of cases

    • Useful for complex economic and institutional processes

    • Allows contextual understanding

    Example:
    A detailed study of one self-help group or cooperative society.

    (e) Narrative and Discourse Analysis

    • Focuses on language, stories, and communication

    • Examines how economic realities are constructed through discourse

    Example:
    Analysing how poverty is framed in government reports versus community narratives.

    5. Relevance of Interpretive Methodologies in Economics

    Interpretive methodologies are particularly useful when:

    • Quantitative data fails to capture social realities

    • Human behaviour, institutions, and culture are central

    • Policy evaluation requires understanding stakeholder perspectives

    They complement quantitative methods by providing depth, context, and meaning.

    Action Research and Its Application to Reducing Malnourishment among Adolescent Students

    1. Meaning of Action Research

    Action Research is a participatory and problem-oriented research approach in which the researcher actively intervenes in a real-life situation to bring about change while simultaneously generating knowledge. It combines action, reflection, and research in a cyclical process.

    Key features:

    • Focus on practical problem-solving

    • Conducted in real social settings

    • Involves stakeholders (teachers, students, parents, community)

    • Cyclical process: Plan → Act → Observe → Reflect

    2. Objectives of Action Research in the Given Context

    The objective of the proposed action research is to:

    • Identify the extent and causes of malnourishment among adolescent students

    • Design and implement context-specific interventions

    • Evaluate outcomes and improve nutritional status

    The study is conducted in a rural school in Gujarat, where adolescent malnutrition may be linked to poverty, dietary habits, and lack of awareness.

    Key Characteristics of Action Research and Their Role in Local-Level Change

    1. Participatory Nature

    Action research actively involves community members in:

    • Problem identification

    • Data collection

    • Decision-making

    This ensures that the voices of disadvantaged groups are heard and valued, leading to solutions rooted in lived experiences.

    2. Problem-Centred and Context-Specific

    Unlike abstract research, action research focuses on real, local problems, such as:

    • Malnutrition

    • Poor sanitation

    • Low school attendance

    This local relevance increases the effectiveness and acceptance of interventions.

    3. Empowerment-Oriented

    Participation builds:

    • Awareness

    • Confidence

    • Collective agency

    Disadvantaged groups move from being subjects of research to agents of change, strengthening social inclusion.

    4. Cyclical and Flexible Process

    Action research follows a cycle of:

    • Planning

    • Action

    • Observation

    • Reflection

    This allows continuous learning and correction, which is crucial in complex social settings.

    5. Immediate Application of Findings

    Findings are not delayed for academic publication but are translated directly into action, making it suitable for urgent local issues.

    6. Democratic and Inclusive

    It reduces power asymmetries between researchers and participants by promoting:

    • Dialogue

    • Mutual learning

    • Collective ownership

    This is especially important for historically excluded groups.

    7. Capacity Building

    Action research enhances local capabilities by developing:

    • Problem-solving skills

    • Leadership

    • Organisational capacity

    This ensures sustainability beyond the research period.

    3. Steps in Action Research

    Action research proceeds through systematic stages, as explained below.

    4. Steps for Data Collection

    (a) Identifying the Problem

    • Preliminary discussions with teachers and health workers

    • Review of school health records

    • Identification of symptoms such as low BMI, fatigue, absenteeism

    (b) Data Collection Methods

    (i) Primary Data

    • Anthropometric measurements: height, weight, BMI

    • Structured questionnaires for students on:

      • Dietary intake

      • Meal frequency

      • Awareness of nutrition

    • Interviews with:

      • Parents

      • Teachers

      • Anganwadi / health workers

    • Observation of:

      • Mid-day meal quality

      • Hygiene practices

    (ii) Secondary Data

    • School health registers

    • ICDS and NFHS reports

    • Government nutrition programme guidelines

    5. Steps for Data Analysis

    (a) Quantitative Analysis

    • Classification of students as undernourished, normal, or overweight using BMI-for-age

    • Frequency and percentage analysis of:

      • Nutrient intake

      • Meal skipping

      • Anaemia symptoms

    (b) Qualitative Analysis

    • Thematic analysis of interviews

    • Identification of key causes:

      • Inadequate diet

      • Cultural food practices

      • Economic constraints

      • Lack of nutrition awareness

    6. Developing and Implementing Action Plans

    Based on findings, targeted interventions are designed.

    (a) Nutritional Interventions

    • Improvement in mid-day meal quality

    • Inclusion of:

  • Pulses, eggs, milk, green vegetables

• Coordination with local health departments

    (b) Awareness and Behavioural Interventions

    • Nutrition education sessions for students

    • Workshops for parents on balanced diets

    • Posters and charts on adolescent nutrition

    (c) Health Interventions

    • Regular health check-ups

    • Iron and folic acid supplementation

    • Deworming programmes

    7. Observation and Evaluation

    • Monitoring changes in:

      • BMI and weight

      • Attendance and participation

    • Feedback from students and teachers

    • Comparison of pre- and post-intervention data

    8. Reflection and Follow-up

    • Evaluation of effectiveness of actions

    • Identification of gaps and improvements

    • Modification of strategies if required

    • Planning the next action research cycle

    9. Advantages of Action Research in This Context

    • Directly addresses a real social problem

    • Encourages community participation

    • Produces immediate and usable outcomes

    • Enhances policy and programme effectiveness

    Difference between Research Design and Research Methods

    (a) Research Design

    Research design refers to the overall plan or blueprint of a research study. It specifies what type of study is to be conducted, how data will be collected, and how analysis will be carried out.

    It answers the question: “What is the overall strategy of the research?”

    Examples:

    • Exploratory research design

    • Descriptive research design

    • Experimental research design

    (b) Research Methods

    Research methods are the specific techniques or procedures used for data collection and analysis within the framework of a research design.

    They answer the question: “How will data be collected and analysed?”

    Examples:

    • Survey method

    • Interview method

    • Statistical analysis

    (c) Differences between Research Design and Research Methods

Basis | Research Design | Research Methods
--- | --- | ---
Meaning | Overall research plan | Techniques used in research
Nature | Conceptual and strategic | Operational and practical
Scope | Broad | Narrow
Focus | Structure of the study | Execution of the study
Sequence | Decided first | Follow the design

    Methods of Univariate Data Analysis

    Meaning of Univariate Data Analysis

    Univariate data analysis refers to the analysis of a single variable at a time. Its main objective is to describe, summarise, and understand the distribution of that variable.

    Methods of Univariate Data Analysis

    (a) Frequency Distribution

    • Data is arranged into classes or categories

    • Shows how often each value occurs

    • Helps identify concentration and spread

    (b) Measures of Central Tendency

    These indicate the central or typical value of the data.

    • Mean – arithmetic average

    • Median – middle value

    • Mode – most frequently occurring value

    (c) Measures of Dispersion

    These measure the spread or variability of data.

    • Range

    • Variance

    • Standard deviation

    (d) Graphical Presentation

    Used for visual representation of data:

    • Bar diagrams

    • Pie charts

    • Histograms
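A minimal pandas sketch of these univariate summaries, using hypothetical household income data:

```python
import pandas as pd

# Hypothetical monthly household incomes (a single variable)
income = pd.Series([4200, 5100, 4800, 7600, 5100, 3900, 6200, 5100, 4500, 8800])

# (a) Frequency distribution: group values into class intervals
freq = pd.cut(income, bins=[3000, 5000, 7000, 9000]).value_counts().sort_index()

# (b) Measures of central tendency
mean, median, mode = income.mean(), income.median(), income.mode().iloc[0]

# (c) Measures of dispersion
value_range, variance, std_dev = income.max() - income.min(), income.var(), income.std()

print(freq)
print(f"mean={mean:.1f}, median={median}, mode={mode}")
print(f"range={value_range}, variance={variance:.1f}, sd={std_dev:.1f}")
```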

    Steps in Analysing Qualitative Data

    1. Meaning of Qualitative Data Analysis

    Qualitative data analysis refers to the systematic process of organising, interpreting, and deriving meaning from non-numerical data such as interviews, observations, field notes, and documents.

    2. Steps for Analysing Qualitative Data

    (a) Data Preparation and Organisation

    • Transcribing interviews and field notes

    • Organising data into texts or documents

    • Reading data repeatedly for familiarity

    (b) Coding

    • Assigning labels or codes to meaningful segments of data

    • Helps in identifying key ideas, concepts, and patterns

    (c) Categorisation

    • Grouping similar codes into broader categories

    • Reduces complexity and aids interpretation

    (d) Theme Identification

    • Identifying recurring themes and relationships

    • Themes represent important patterns in the data

    (e) Interpretation

    • Linking themes with research questions

    • Understanding meanings in social and economic contexts

    (f) Validation

    • Checking consistency of interpretations

    • Use of triangulation and participant feedback

    Analysis of Findings in Grounded Theory

    Meaning of Grounded Theory

    Grounded Theory is a qualitative research approach in which theory is developed inductively from data, rather than testing pre-existing theories.

    Steps in Analysing Findings from Grounded Theory

    (a) Open Coding

    • Breaking data into small units

    • Identifying concepts and initial categories

    (b) Axial Coding

    • Linking categories and sub-categories

    • Identifying causal relationships and conditions

    (c) Selective Coding

    • Identifying a core category

    • Integrating all categories around this central theme

    (d) Constant Comparative Method

    • Continuous comparison of data with emerging categories

    • Refines concepts and strengthens theory

    (e) Theory Development

    • Formulating a substantive theory grounded in data

    • Explaining processes, actions, or interactions

    Multicollinearity: Meaning, Detection and Implications

    1. Meaning of Multicollinearity

    Multicollinearity refers to a situation in a multiple regression model where two or more independent (explanatory) variables are highly correlated with each other. As a result, it becomes difficult to isolate the individual effect of each explanatory variable on the dependent variable.

    Multicollinearity can be:

    • Perfect multicollinearity: exact linear relationship (model cannot be estimated)

    • Imperfect (high) multicollinearity: strong but not exact correlation (model is estimable but problematic)

    2. Detection of Multicollinearity

    (a) Correlation Matrix

    • High pairwise correlation coefficients (close to ±1) among explanatory variables indicate multicollinearity.

    (b) Variance Inflation Factor (VIF)

    • Measures how much the variance of a coefficient is inflated due to multicollinearity.

    • Rule of thumb:

      • VIF > 10 → serious multicollinearity

    (c) T-statistics and R² Paradox

    • High R² but statistically insignificant individual coefficients

    • Indicates explanatory variables move together

    (d) Auxiliary Regression

    • Regress one independent variable on the others

    • A high R² suggests multicollinearity

    (e) Instability of Coefficients

    • Coefficient estimates change significantly with small changes in data or model specification
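A minimal statsmodels sketch of the VIF check described above, using simulated data in which two regressors are deliberately near-collinear:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)

# Hypothetical explanatory variables; x2 is constructed to be highly correlated with x1
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)   # near-collinear with x1
x3 = rng.normal(size=n)

X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# VIF for each explanatory variable (the constant is skipped)
for i, name in enumerate(X.columns):
    if name != "const":
        print(name, round(variance_inflation_factor(X.values, i), 2))
```

With these simulated data, x1 and x2 should show inflated VIFs while x3 stays close to 1.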

    3. Implications of Multicollinearity

    (a) Inflated Standard Errors

    • Makes coefficient estimates less precise

    (b) Insignificant t-values

    • Important variables may appear statistically insignificant

    (c) Unreliable Coefficient Estimates

    • Signs and magnitudes of coefficients may be counterintuitive

    (d) Difficulty in Interpretation

    • Hard to assess the individual impact of explanatory variables

    (e) Reduced Predictive Reliability

    • While overall fit (R²) may be high, predictions become less reliable

    Difference between Realism and Instrumentalism and Evaluation of Milton Friedman’s Instrumentalism

    1. Realism and Instrumentalism: Meaning

    (a) Realism

    Realism is a philosophical position which holds that:

    • Economic assumptions should be realistic and descriptively accurate

    • Models should reflect actual behaviour and real-world mechanisms

    • The truth or realism of assumptions matters for scientific explanation

    In realism, theories are judged by:

    • Plausibility of assumptions

    • Explanatory power

    • Correspondence with real economic behaviour

    (b) Instrumentalism

    Instrumentalism argues that:

    • The realism of assumptions is irrelevant

    • What matters is the predictive accuracy of a theory

    • Theories are merely tools (instruments) for prediction, not descriptions of reality

    Thus, even unrealistic assumptions are acceptable if predictions are accurate.

    2. Differences between Realism and Instrumentalism

Basis | Realism | Instrumentalism
--- | --- | ---
View of assumptions | Must be realistic | Can be unrealistic
Purpose of theory | Explanation + prediction | Prediction only
Focus | Truth and causality | Usefulness
Relation to reality | Descriptive | Pragmatic
Evaluation criterion | Realism + accuracy | Predictive success

3. Milton Friedman’s Instrumentalist Approach

    The most influential proponent of instrumentalism in economics is Milton Friedman.

    Friedman’s Core Argument

    In his essay “The Methodology of Positive Economics”, Friedman argued that:

    • Economic theories should not be judged by the realism of assumptions

    • Unrealistic assumptions are common and unavoidable

    • A theory is valid if it yields accurate predictions

    Example:

    • Firms are assumed to maximize profits, even if real firms do not consciously do so

    • The assumption is justified if it predicts firm behaviour correctly

    4. Critical Evaluation of Friedman’s Instrumentalism

    (a) Strengths

    1. Practical and pragmatic

      • Allows economists to build simple models

      • Encourages testable predictions

    2. Promotes empirical testing

  • Shifts focus from philosophical debates to observable outcomes

    3. Useful in policy analysis

      • Predictive models can guide decision-making even if assumptions are simplified

    (b) Criticisms

    1. Neglect of explanation

      • Accurate prediction does not guarantee correct explanation

      • Wrong mechanisms may produce right predictions temporarily

    2. Weakens causal understanding

  • Unrealistic assumptions may hide important institutional and behavioural factors

    3. Problem in complex social systems

      • In economics, predictions often fail due to changing contexts

      • Unrealistic assumptions reduce robustness

    4. Limits critical scrutiny

      • If assumptions are ignored, theories become difficult to challenge meaningfully

    Usefulness of Analysis of Variance (ANOVA) in the Regression Model

    1. Meaning of ANOVA in Regression

    In the context of a regression model, Analysis of Variance (ANOVA) is a statistical technique used to decompose the total variation in the dependent variable into components attributable to the regression model and to random error. It helps in assessing the overall significance and explanatory power of the regression equation.

    2. Decomposition of Variance in Regression

    ANOVA divides the Total Sum of Squares (TSS) into:

$$\text{TSS} = \text{ESS} + \text{RSS}$$

where:

    • TSS (Total Sum of Squares): total variation in the dependent variable

    • ESS (Explained Sum of Squares): variation explained by the regression model

    • RSS (Residual Sum of Squares): unexplained variation (error)

    3. Usefulness of ANOVA in Regression Analysis

    (a) Testing Overall Significance of the Model

    ANOVA provides the F-test, which tests the null hypothesis:

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$$

    • It checks whether the explanatory variables jointly influence the dependent variable

    • Helps determine whether the regression model is meaningful

    (b) Measuring Explanatory Power

    From ANOVA, the coefficient of determination (R²) is obtained:

$$R^2 = \frac{ESS}{TSS}$$

    • Indicates the proportion of variation explained by the model

    • Higher R² implies better model fit

    (c) Comparison of Alternative Models

    • ANOVA allows comparison between restricted and unrestricted models

    • Useful in model selection and specification testing

    (d) Identifying Unexplained Variation

    • RSS highlights the magnitude of random error

    • Helps in diagnosing model inadequacy or omitted variables

    (e) Basis for Further Diagnostic Tests

    ANOVA provides a framework for:

    • Testing significance in multiple regression

    • Understanding goodness of fit before interpreting individual coefficients

    4. ANOVA Table in Regression (Illustrative)

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square
--- | --- | --- | ---
Regression | ESS | k | ESS / k
Residual | RSS | n − k − 1 | RSS / (n − k − 1)
Total | TSS | n − 1 |
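A minimal statsmodels sketch of this decomposition and the overall F-test, using simulated data (the variable names and coefficients are illustrative, not from any real study):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Hypothetical data: y depends on two regressors plus random error
n = 100
X = rng.normal(size=(n, 2))
y = 2.0 + 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(size=n)

model = sm.OLS(y, sm.add_constant(X)).fit()

ess = model.ess            # explained sum of squares
rss = model.ssr            # residual sum of squares
tss = model.centered_tss   # total sum of squares

print(f"TSS = ESS + RSS: {tss:.2f} = {ess:.2f} + {rss:.2f}")
print(f"R^2 = ESS/TSS = {ess / tss:.3f}")
print(f"F-statistic = {model.fvalue:.2f}, p-value = {model.f_pvalue:.4f}")
```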

    Lorenz Curve as a Tool for Measuring Inequality

    1. Meaning of the Lorenz Curve

    The Lorenz Curve is a graphical tool used to measure inequality in the distribution of income or wealth in an economy. It shows the relationship between:

    • Cumulative percentage of population (on the X-axis), arranged from poorest to richest

    • Cumulative percentage of income or wealth (on the Y-axis)

    The farther the Lorenz curve lies from the line of equality, the greater is the inequality.

(Figure: Lorenz curve and the Gini coefficient)

    2. Lorenz Curve under Different Cases

    (a) Perfect Equality

    • Income is equally distributed among all individuals.

    • Each percentage of population receives the same percentage of income.

    Representation:

    • The Lorenz curve coincides with the 45° line, known as the line of perfect equality.

    • Example:

      • 20% of population earns 20% of income

      • 50% of population earns 50% of income

    Interpretation:

    • No inequality exists.

    • Gini coefficient = 0

    (b) Perfect Inequality

    • One individual (or household) receives all the income, while others receive nothing.

    Representation:

    • The Lorenz curve lies along the horizontal axis until the last individual.

    • At 100% population, income jumps suddenly to 100%.

    Interpretation:

    • Extreme inequality exists.

    • Gini coefficient = 1

    (c) Relative Inequality

    • Income is unequally distributed, but not perfectly unequal.

    • This is the most common real-world case.

    Representation:

    • The Lorenz curve lies between the line of perfect equality and the curve of perfect inequality.

    • The greater the bow away from the equality line, the higher the inequality.

    Interpretation:

    • Indicates partial inequality.

    • Allows comparison between:

      • Regions

      • Time periods

      • Countries

    3. Importance of the Lorenz Curve

    • Simple and visual measure of inequality

    • Helps in comparing income distributions

    • Basis for calculating the Gini coefficient

    • Widely used in economic and development studies
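A minimal NumPy sketch that computes the Lorenz curve coordinates and an approximate Gini coefficient for a hypothetical income distribution:

```python
import numpy as np

def lorenz_and_gini(income):
    """Return cumulative population/income shares and the Gini coefficient."""
    income = np.sort(np.asarray(income, dtype=float))
    cum_income = np.cumsum(income) / income.sum()           # cumulative income share (Y-axis)
    cum_pop = np.arange(1, len(income) + 1) / len(income)   # cumulative population share (X-axis)
    # Gini = 1 - 2 * (area under the Lorenz curve), approximated by the trapezoidal rule
    area = np.trapz(np.concatenate(([0.0], cum_income)),
                    np.concatenate(([0.0], cum_pop)))
    return cum_pop, cum_income, 1 - 2 * area

# Hypothetical household incomes
incomes = [1200, 1500, 1800, 2500, 4000, 6000, 9000, 20000]
pop_share, income_share, gini = lorenz_and_gini(incomes)
print("Gini coefficient:", round(gini, 3))
```

For perfectly equal incomes this sketch returns a Gini of 0; the more the income shares lag behind the population shares, the closer the value gets to 1.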

Why is the Quasi-Participant Method Preferred over Simple Observation for Data Collection?

    1. Meaning of Quasi-Participant Method

    The quasi-participant method is a qualitative data collection approach in which the researcher partially participates in the social setting being studied while maintaining analytical distance. The researcher interacts with participants but does not become a full member of the group.

    2. Meaning of Simple Observation

    In simple observation, the researcher remains a detached observer, recording behaviour without interacting with participants. The role is passive, and understanding is limited to what is externally visible.

    3. Reasons Why Quasi-Participant Method is Preferred

    (a) Deeper Understanding of Social Reality

    • Quasi-participation allows the researcher to understand meanings, motivations, and perceptions

    • Observation alone captures only surface behaviour

    (b) Better Contextual Interpretation

    • Social and economic actions are context-dependent

    • Partial participation helps interpret actions within cultural, institutional, and social contexts

    (c) Access to Insider Information

    • Interaction builds rapport and trust

    • Participants may share experiences and explanations not visible through observation

    (d) Reduced Observer Bias

    • Pure observation may lead to misinterpretation of actions

    • Engagement allows clarification and verification of observed behaviour

    (e) Captures Dynamic Processes

    • Economic behaviour (e.g., labour relations, informal markets) involves processes over time

    • Quasi-participation captures changes, negotiations, and adaptations better than static observation

    (f) More Suitable for Development and Institutional Studies

    • Useful in studying:

      • Poverty

      • Informal sector

      • Rural institutions

    • These require understanding lived experiences, not just visible actions

    4. Limitations of Simple Observation

    • Limited insight into intentions and meanings

    • Risk of superficial conclusions

    • Cannot explain why people behave in a certain way

    Hierarchical and Non-Hierarchical Methods of Clustering

    Meaning of Clustering

    Clustering is a multivariate statistical technique used to group observations such that:

    • objects within a cluster are similar, and

    • objects between clusters are dissimilar,
      based on selected variables and a distance/similarity measure.

    1. Hierarchical Clustering Methods

    Definition

    Hierarchical clustering creates a nested sequence of clusters arranged in the form of a tree structure (dendrogram). Once a cluster is formed, it cannot be undone.

    Types

    1. Agglomerative (bottom-up)

      • Each observation starts as a separate cluster

      • Clusters are progressively merged

      • Most commonly used

    2. Divisive (top-down)

      • All observations start in one cluster

      • Clusters are progressively split

      • Computationally expensive and less common

    Common linkage criteria

    • Single linkage (nearest neighbour)

    • Complete linkage (farthest neighbour)

    • Average linkage

    • Ward’s method (minimises within-cluster variance)

(Figure: dendrograms produced by agglomerative hierarchical clustering)

    Features

    • Number of clusters not required in advance

    • Produces a dendrogram for visual interpretation

    • Sensitive to outliers and noise

    • Computationally intensive for large samples

    2. Non-Hierarchical (Partitioning) Clustering Methods

    Definition

    Non-hierarchical clustering divides the data into a pre-specified number of clusters (k) and allows reallocation of observations to improve cluster quality.

    Common methods

    • K-means clustering

    • K-medoids

    • ISODATA

(Figure: K-means partitioning of observations into k clusters)

    Features

    • Number of clusters must be specified beforehand

    • Observations can move between clusters

    • Efficient for large datasets

    • Sensitive to initial seed selection
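A minimal sketch contrasting the two approaches on simulated data, using Ward’s linkage (SciPy) for the hierarchical case and K-means (scikit-learn) for the non-hierarchical case; the indicator names are hypothetical:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)

# Hypothetical district-level indicators (e.g., standardised literacy rate and per-capita income)
X = rng.normal(size=(30, 2))

# Hierarchical (agglomerative) clustering with Ward's linkage, cut into 3 clusters
Z = linkage(X, method="ward")
hier_labels = fcluster(Z, t=3, criterion="maxclust")

# Non-hierarchical (partitioning) clustering: K-means with k specified in advance
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

print("Hierarchical labels:", hier_labels[:10])
print("K-means labels     :", kmeans_labels[:10])
```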

    3. Differences between Hierarchical and Non-Hierarchical Clustering

Basis | Hierarchical Clustering | Non-Hierarchical Clustering
--- | --- | ---
Nature | Nested, tree-based | Partition-based
Number of clusters | Not pre-specified | Must be pre-specified
Reallocation | Not possible | Possible
Output | Dendrogram | Final cluster membership
Computational cost | High | Relatively low
Suitability | Small datasets | Large datasets
Sensitivity to outliers | High | Moderate
Interpretation | Visual and intuitive | Less visual

    4. Criteria for Choosing a Clustering Method

    1. Research Objective

    • Exploratory analysis → Hierarchical clustering

    • Classification or segmentation → Non-hierarchical clustering

    2. Size of Dataset

    • Small samples (e.g., village surveys, pilot studies) → Hierarchical

    • Large samples (e.g., NSS, NFHS datasets) → Non-hierarchical

    3. Knowledge of Number of Clusters

    • Unknown number of groups → Hierarchical

    • Known or policy-driven grouping (e.g., poor vs non-poor) → Non-hierarchical

    4. Need for Interpretability

    • If visual interpretation and structure matter → Hierarchical

    • If efficiency and final grouping matter → Non-hierarchical

    5. Presence of Outliers

    • If data has many outliers → Prefer Non-hierarchical methods

    6. Computational Constraints

    • Limited computing power → Non-hierarchical

    • Rich computing resources → Hierarchical possible

    5. Application in Economic Research

    • Hierarchical clustering:

      • Regional development analysis

      • Typology of states or districts

      • Exploratory poverty profiling

    • Non-hierarchical clustering:

      • Consumer segmentation

      • Labour market classification

      • Credit risk grouping

    National Family Health Survey (NFHS)

    The National Family Health Survey (NFHS) is one of India’s most comprehensive and authoritative large-scale household survey databases on population, health, and nutrition. It provides reliable, nationally representative data crucial for public health planning, policy formulation, and academic research.

NFHS is conducted by the Ministry of Health and Family Welfare (MoHFW), Government of India, with the International Institute for Population Sciences (IIPS), Mumbai, serving as the nodal agency.

    2. NFHS as a Public Health Database

    Meaning and Scope

    NFHS is a repeated cross-section survey conducted at regular intervals to collect data on:

    • Population and demographic indicators

    • Health and nutrition outcomes

    • Maternal and child health

    • Reproductive health

    • Disease prevalence

    • Health service utilisation

    NFHS rounds include:

    • NFHS-1 (1992–93)

    • NFHS-2 (1998–99)

    • NFHS-3 (2005–06)

    • NFHS-4 (2015–16)

    • NFHS-5 (2019–21)

    Key Features of NFHS Database

    1. National and Sub-national Coverage

      • Representative at national, state, and district levels

      • Enables regional and inter-district comparisons

    2. Large Sample Size

      • Covers millions of individuals and households

      • Enhances statistical reliability

    3. Standardised Methodology

      • Uniform questionnaires across states

      • Allows comparison over time

    4. Rich Health Indicators

      • Fertility, mortality, family planning

      • Child nutrition (stunting, wasting, underweight)

      • Anaemia, obesity, hypertension

      • Immunisation and maternal care

    5. Gender-disaggregated Data

      • Special focus on women, children, and adolescents

    3. NFHS Variables Relevant to Public Health

    (a) Maternal and Child Health

    • Antenatal care

    • Institutional deliveries

    • Infant and child mortality

    • Breastfeeding practices

    (b) Nutrition and Anthropometry

    • Height-for-age (stunting)

    • Weight-for-height (wasting)

    • BMI, anaemia levels

    (c) Disease Burden

    • Hypertension

    • Diabetes

    • HIV knowledge

    • Tuberculosis awareness

    (d) Health Infrastructure and Access

    • Sanitation and drinking water

    • Health insurance coverage

    • Utilisation of public vs private healthcare

    4. Usefulness of NFHS for Researchers

    1. Evidence-Based Research

    NFHS provides high-quality secondary data, enabling researchers to conduct:

    • Public health studies

    • Demographic analysis

    • Nutritional epidemiology

    • Health inequality research

    2. Policy Evaluation

    Researchers use NFHS to:

    • Evaluate government programmes such as:

      • National Health Mission (NHM)

      • POSHAN Abhiyaan

      • Janani Suraksha Yojana

    • Assess outcomes before and after policy interventions

    3. Study of Health Inequalities

    NFHS data allows analysis across:

    • Income groups

    • Caste and religion

    • Rural–urban divide

    • Gender and regional disparities

    4. Time-Series and Trend Analysis

    Since NFHS is conducted periodically, researchers can:

    • Study changes in health indicators over time

    • Analyse long-term improvements or setbacks in public health

    5. Micro-level Econometric Analysis

    NFHS is widely used for:

    • Regression analysis

    • Logistic and probit models

    • Impact evaluation studies

    Examples

    • Determinants of child malnutrition

    • Effect of female education on fertility

    • Link between sanitation and child health

    6. Interdisciplinary Research

    NFHS supports research in:

    • Economics

    • Public health

    • Sociology

    • Gender studies

    • Development studies

    7. International Comparability

    NFHS follows global standards (DHS framework), allowing:

    • Cross-country comparisons

    • Global health benchmarking

    5. Limitations of NFHS (Brief)

    • Mostly cross-sectional in nature

    • Limited scope for causal inference

    • Self-reported health data may involve recall bias

    Despite these, NFHS remains the most reliable public health dataset in India.

    Comparative overview of Verification and Falsification.

    1. Introduction

    The debate between verification and falsification lies at the heart of the philosophy of science and directly influences how research hypotheses are formulated, tested, and evaluated in economics and social sciences. These approaches represent two contrasting views of scientific knowledge and progress.

    2. Verification Principle

    Meaning

    The verification principle holds that a scientific statement is meaningful only if it can be empirically verified through observation or experiment.

    This view is associated with logical positivism, particularly the Vienna Circle.

    Core Idea

    A theory is scientific if repeated observations confirm it.

    Key Features

    • Emphasis on induction

    • Knowledge grows through accumulation of confirming evidence

    • Statements must be empirically observable

    • Unobservable or metaphysical statements are considered meaningless

    Example (Economics)

    “Increase in income leads to increase in consumption.”

    If repeated data observations support this, the theory is considered verified.

    Limitations of Verification

    1. Problem of induction:
      No number of positive observations can conclusively prove a universal law.

    2. Cannot handle counter-examples adequately.

    3. Leads to confirmation bias, where researchers seek only supporting evidence.

    3. Falsification Principle

    Meaning

    The principle of falsification, proposed by Karl Popper, argues that a theory is scientific not because it can be verified, but because it can be falsified.

    Core Idea

    A theory is scientific if it makes risky predictions that could, in principle, be proven false.

    Key Features

    • Emphasis on deduction

    • Scientific knowledge grows through conjectures and refutations

    • One counter-example is sufficient to reject a theory

    • Clear demarcation between science and non-science

    Example (Economics)

    “Minimum wages always reduce employment.”

    If even one credible empirical case contradicts this, the universal claim is falsified.

    Strengths of Falsification

    • Avoids the problem of induction

    • Encourages critical testing

    • Promotes scientific rigor and objectivity

    Criticisms

    • In practice, theories are rarely rejected outright due to:

      • Measurement errors

      • Auxiliary assumptions

    • Social sciences often deal with probabilistic laws, not strict universals

    4. Verification vs Falsification: A Comparative Overview

Basis | Verification | Falsification
--- | --- | ---
Philosophical base | Logical Positivism | Critical Rationalism
Key thinker | Vienna Circle | Karl Popper
Logic used | Induction | Deduction
Criterion of science | Confirmability | Falsifiability
Role of evidence | Accumulates support | Seeks refutation
Status of theory | Accepted if verified | Accepted until falsified
View of progress | More confirmations | Elimination of false theories
Risk attitude | Conservative | Risk-embracing

    Popperian View of Verisimilitude

    Meaning of Verisimilitude

    Verisimilitude means truth-likeness or closeness to the truth.

    Popper acknowledged that:

    • Scientific theories are never perfectly true

    • Yet, science progresses by developing theories that are closer to the truth

    Popper’s Argument

    Even when a theory is falsified, it may still:

    • Explain more facts

    • Make more precise predictions

    • Have greater empirical content than earlier theories

    Thus, a new theory can be false yet better than an older one.

    Example

    • Newtonian mechanics is false at relativistic speeds

    • Yet it is more truth-like than Aristotelian physics

    • Einstein’s theory is even closer to truth

    How Verisimilitude Enables Scientific Progress

    • Science progresses without certainty

    • Replacement theories have:

      • Greater explanatory power

      • Higher falsifiability

      • Wider empirical scope

    Criticism of Popper’s Verisimilitude

    • Early formulations faced logical difficulties

    • Measuring “closeness to truth” is problematic

    • In social sciences, truth-likeness is often context-dependent

    Despite this, verisimilitude remains a powerful idea explaining progress without verification.

    6. Relevance to Economic Research

    • Econometric models are tested, not verified

    • Hypotheses are retained until falsified

    • Competing theories are compared based on:

      • Predictive power

      • Empirical adequacy

      • Explanatory scope

    Thus, economics largely follows a Popperian methodological stance in practice.

    Correspondence Analysis for analysing associations

    Correspondence Analysis (CA) is a multivariate statistical technique used to analyse and visually represent associations between categories of qualitative variables. It is especially useful when data are presented in the form of contingency tables.

    2. Usefulness of Correspondence Analysis in Analysing Associations

    1. Analysis of Association

    Correspondence analysis helps in:

    • Identifying patterns of association between row and column categories

    • Measuring the degree of similarity or dissimilarity among categories

    Categories located closer in the graphical display indicate stronger association.

    2. Graphical Representation

    CA provides a low-dimensional map (usually two-dimensional) where:

    • Rows and columns are displayed simultaneously

    • Associations are visually interpreted

    This is particularly useful for exploratory data analysis.

    3. Reduction of Data Complexity

    Large contingency tables are simplified into:

    • A few principal dimensions

    • Without significant loss of information

    This helps researchers interpret complex categorical relationships easily.

    4. Detection of Structure and Profiles

    Correspondence analysis identifies:

    • Row profiles (distribution of categories across columns)

    • Column profiles (distribution across rows)

    This helps uncover hidden structures in categorical data.

    5. Application in Social and Economic Research

    CA is widely used in:

    • Consumer preference analysis

    • Poverty and deprivation studies

    • Occupational structure analysis

    • Education and health surveys

    3. Applicability to Categorical Variables

    Yes, correspondence analysis applies specifically to categorical variables.

    • It is designed for nominal and ordinal variables

    • Works on frequency data from:

      • Two-way tables

      • Multi-way contingency tables

    Unlike regression analysis, CA does not require numerical measurement of variables.
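A minimal NumPy sketch of the core CA computation (an SVD of the standardised residuals of the contingency table), using a hypothetical education-by-occupation table:

```python
import numpy as np

# Hypothetical contingency table: education level (rows) x occupation type (columns)
N = np.array([[30, 15,  5],
              [20, 35, 10],
              [ 5, 20, 40]], dtype=float)

P = N / N.sum()            # correspondence matrix
r = P.sum(axis=1)          # row masses
c = P.sum(axis=0)          # column masses

# Standardised residuals; their SVD yields the principal dimensions of the CA map
S = np.diag(r ** -0.5) @ (P - np.outer(r, c)) @ np.diag(c ** -0.5)
U, sv, Vt = np.linalg.svd(S, full_matrices=False)

row_coords = np.diag(r ** -0.5) @ U * sv      # principal coordinates of row categories
col_coords = np.diag(c ** -0.5) @ Vt.T * sv   # principal coordinates of column categories

# First two dimensions are usually plotted; nearby row and column points suggest association
print(np.round(row_coords[:, :2], 3))
print(np.round(col_coords[:, :2], 3))
```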

    4. Advantages

    • Non-parametric in nature

    • No assumption of normality

    • Suitable for survey and census data

    • Complements chi-square tests by adding interpretation

    5. Limitations (Brief)

    • Mainly exploratory

    • Sensitive to rare categories

    • Interpretation may be subjective
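    A compact way to see the mechanics is to run classical correspondence analysis "by hand" on a small contingency table. The sketch below is illustrative only: the education-by-occupation counts are hypothetical, and it uses the standard SVD-of-standardized-residuals construction rather than any particular software package.

    ```python
    import numpy as np

    # Hypothetical 3x3 contingency table: education level (rows) x occupation (columns)
    N = np.array([[50, 30, 20],
                  [25, 45, 30],
                  [10, 25, 65]], dtype=float)

    P = N / N.sum()                      # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses

    # Standardized residuals: (P_ij - r_i * c_j) / sqrt(r_i * c_j)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

    # Singular value decomposition gives the principal dimensions
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)

    row_coords = (U * sv) / np.sqrt(r)[:, None]      # principal coordinates of rows
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]   # principal coordinates of columns

    inertia = sv**2                                  # inertia explained by each dimension
    print("Share of inertia:", inertia / inertia.sum())
    print("Row coordinates:\n", row_coords[:, :2])
    print("Column coordinates:\n", col_coords[:, :2])
    ```

    Plotting the first two columns of the row and column coordinates gives the usual two-dimensional correspondence map, where categories lying close together suggest stronger association.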

    How Canonical Correlation Analysis (CCA) helps analyse interdisciplinary constructs

    1. Introduction

    Canonical Correlation Analysis (CCA) is a multivariate statistical technique used to study the relationship between two sets of variables simultaneously. Unlike simple correlation or regression, which examine relationships between individual variables, CCA captures the overall association between two multidimensional constructs, making it especially useful for interdisciplinary research.

    2. Meaning of Canonical Correlation Analysis

    CCA finds:

    • Linear combinations of variables in the first set (the canonical variate U), and

    • Linear combinations of variables in the second set (the canonical variate V)

    such that the correlation between U and V is maximised.

    $$U = a_1X_1 + a_2X_2 + \dots + a_pX_p$$
    $$V = b_1Y_1 + b_2Y_2 + \dots + b_qY_q$$

    3. Why CCA is Useful for Interdisciplinary Constructs

    Interdisciplinary research often involves:

    • Multiple variables from different disciplines

    • Concepts that cannot be captured by a single indicator

    CCA helps by:

    1. Capturing Multidimensional Relationships

    It analyses sets of variables together, not one-to-one relationships.

    2. Integrating Different Disciplines

    Allows joint analysis of:

    • Economic variables

    • Social, psychological, health, or environmental variables

    within a single framework.

    3. Reducing Complexity

    Multiple correlated variables are reduced to a few canonical functions, making interpretation manageable.

    4. Avoiding Multiple Regression Problems

    Instead of running many regressions, CCA provides a single comprehensive association measure.

    4. Simple Interdisciplinary Example

    Example: Relationship between Socio-economic Status and Health Outcomes

    Set 1: Economic Variables

    • Income

    • Education

    • Employment status

    Set 2: Health Variables

    • Body Mass Index (BMI)

    • Anaemia status

    • Frequency of illness

    Using CCA:

    • A composite socio-economic index is created from income, education, and employment.

    • A composite health index is created from BMI, anaemia, and illness frequency.

    • CCA estimates the strength of association between these two indices.

    Interpretation

    • A high canonical correlation indicates that better socio-economic conditions are strongly associated with better health outcomes.

    • Helps economists, sociologists, and public-health researchers draw integrated conclusions.
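    A minimal sketch of this example, assuming simulated (hypothetical) data and scikit-learn's CCA estimator, is shown below; in practice the X and Y matrices would hold the observed socio-economic and health indicators.

    ```python
    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(0)
    n = 200

    # Hypothetical socio-economic set: income, years of education, employment status
    X = rng.normal(size=(n, 3))
    # Hypothetical health set: BMI, anaemia indicator, illness frequency,
    # constructed here to be correlated with the socio-economic set
    Y = X @ rng.normal(size=(3, 3)) + rng.normal(scale=0.5, size=(n, 3))

    cca = CCA(n_components=1)
    U, V = cca.fit_transform(X, Y)        # first canonical variate of each set

    # First canonical correlation: correlation between the two variates
    r1 = np.corrcoef(U[:, 0], V[:, 0])[0, 1]
    print("First canonical correlation:", round(r1, 3))
    ```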

    5. Applications in Research

    • Health economics

    • Education and labour market studies

    • Environment–economy linkages

    • Development and welfare analysis

    6. Limitations (Brief)

    • Requires large sample sizes

    • Interpretation can be complex

    • Assumes linear relationships

    Evaluation of Nominal, Ordinal, Interval and Ratio Scale Variables & Applicable Measures of Central Tendency

    Measurement scales classify variables based on the nature of information they contain and the type of mathematical operations permitted on them. The four commonly used scales—nominal, ordinal, interval, and ratio—differ in terms of ordering, distance, and origin, which in turn determines the appropriate measure of central tendency.

    2. Evaluation on Common Parameters

    | Parameter | Nominal Scale | Ordinal Scale | Interval Scale | Ratio Scale |
    | --- | --- | --- | --- | --- |
    | Nature of data | Qualitative | Qualitative | Quantitative | Quantitative |
    | Classification | Yes | Yes | Yes | Yes |
    | Ordering | No | Yes | Yes | Yes |
    | Equal intervals | No | No | Yes | Yes |
    | True zero | No | No | No | Yes |
    | Meaningful ratios | No | No | No | Yes |
    | Examples | Gender, caste | Income groups, ranks | Temperature (°C) | Income, age, expenditure |

    1. Nominal Scale Variables

    Nominal variables classify data into distinct categories without any order or ranking. The numbers or labels assigned are purely symbolic.

    Key features

    • No ordering

    • No arithmetic operations possible

    • Only equality/inequality can be checked

    Example (Economics)
    Type of occupation: farmer, labourer, self-employed

    Measure of central tendency: Mode

    2. Ordinal Scale Variables

    Ordinal variables classify data into categories that have a meaningful order, but the distance between categories is not equal or known.

    Key features

    • Ranking is possible

    • Differences are not measurable

    • Arithmetic operations not meaningful

    Example (Economics)
    Income groups: low, middle, high

    Measure of central tendency: Median and Mode

    3. Interval Scale Variables

    Interval variables have ordered categories with equal intervals, but they lack a true zero, so ratios are not meaningful.

    Key features

    • Order and equal spacing

    • Zero is arbitrary

    • Differences are meaningful, ratios are not

    Example (Economics)
    Consumer Price Index (CPI)

    Measure of central tendency: Mean, Median, Mode

    4. Ratio Scale Variables

    Ratio variables possess all the properties of interval scales and additionally have a true zero, making ratios meaningful.

    Key features

    • Order, equal intervals, true zero

    • All arithmetic operations possible

    Example (Economics)
    Income, consumption expenditure

    Measure of central tendency: Mean, Median, Mode

    3. Measures of Central Tendency Applicable

    (a) Nominal Scale

    • Only mode is applicable

    • Mean and median are meaningless because:

      • No numerical value

      • No ordering

    Example: Most common occupation in a village

    ✔ Applicable: Mode only

    (b) Ordinal Scale

    • Median and mode are applicable

    • Mean is not appropriate due to:

      • Lack of equal intervals

    Example: Poverty categories (low, medium, high)

    ✔ Applicable: Median, Mode

    (c) Interval Scale

    • Mean, median, and mode are applicable

    • Ratios are meaningless due to absence of true zero

    Example: Temperature in Celsius

    ✔ Applicable: Mean, Median, Mode

    (d) Ratio Scale

    • All measures of central tendency are applicable

    • Allows meaningful comparison using ratios

    Example: Monthly income, consumption expenditure

    ✔ Applicable: Mean, Median, Mode

    4. Summary Table: Scale vs Central Tendency

    | Scale | Mode | Median | Mean |
    | --- | --- | --- | --- |
    | Nominal | ✔ | ✘ | ✘ |
    | Ordinal | ✔ | ✔ | ✘ |
    | Interval | ✔ | ✔ | ✔ |
    | Ratio | ✔ | ✔ | ✔ |
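    A small pandas sketch, using hypothetical village data, shows which measure of central tendency is meaningful for each scale; the column names and values are illustrative only.

    ```python
    import pandas as pd

    df = pd.DataFrame({
        "occupation": ["farmer", "labourer", "farmer", "self-employed", "farmer"],  # nominal
        "income_group": pd.Categorical(["low", "middle", "middle", "high", "middle"],
                                       categories=["low", "middle", "high"],
                                       ordered=True),                               # ordinal
        "monthly_income": [8000, 15000, 14000, 25000, 16000],                       # ratio
    })

    # Nominal: only the mode is meaningful
    print(df["occupation"].mode()[0])

    # Ordinal: mode and median (middle category by rank) are meaningful
    print(df["income_group"].mode()[0])
    codes = df["income_group"].cat.codes.sort_values()
    print(df["income_group"].cat.categories[codes.iloc[len(codes) // 2]])

    # Ratio: mean, median and mode are all meaningful
    print(df["monthly_income"].mean(), df["monthly_income"].median())
    ```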

    Mixed Methods Research

    1. Meaning of Mixed Methods Research

    Mixed Methods Research is an approach that integrates both quantitative and qualitative methods within a single study to gain a more comprehensive understanding of a research problem.

    It combines:

    • Quantitative methods (numbers, measurement, statistical analysis), and

    • Qualitative methods (meanings, experiences, perceptions).

    The core idea is that numbers explain “how much”, while qualitative insights explain “why and how.”

    2. Key Characteristics

    1. Methodological Integration

      • Uses surveys, experiments, econometrics along with interviews, focus groups, or case studies.
    2. Complementarity

      • Qualitative findings help interpret quantitative results, and vice versa.
    3. Triangulation

      • Cross-validates findings using different methods, increasing reliability.
    4. Flexibility

      • Can be sequential (one after the other) or concurrent (both together).

    3. Types of Mixed Methods Designs (Brief)

    • Sequential Explanatory: Quantitative → Qualitative

    • Sequential Exploratory: Qualitative → Quantitative

    • Concurrent Design: Both conducted simultaneously

    4. Illustration with an Example

    Example: Studying Female Labour Force Participation in Rural India

    Quantitative Component

    • Use NSS/NFHS data

    • Apply regression analysis to study effects of education, wages, and household income

    Qualitative Component

    • Conduct interviews and focus group discussions with rural women

    • Explore cultural norms, mobility constraints, and household decision-making

    Integration

    • Quantitative analysis may show low participation despite education

    • Qualitative findings explain this through social norms and unpaid care work

    Thus, mixed methods provide both statistical evidence and contextual understanding.

    5. Advantages

    • Richer and deeper analysis

    • Better policy relevance

    • Reduces bias of single-method studies

    • Especially useful in development and public policy research

    Hypothesis

    A hypothesis is a tentative, testable statement about the relationship between two or more variables. It is formulated to provide a possible explanation of a phenomenon and is subjected to empirical verification using data.

    In research methodology, a hypothesis acts as a guide to investigation, helping the researcher decide:

    • what data to collect,

    • how to analyze it, and

    • what conclusions to draw.

    Example (Economics):

    “An increase in education level leads to higher wages.”

    This statement can be tested using data on education and income.

    Definitions by Scholars

    • Goode and Hatt: “A hypothesis is a proposition which can be put to test to determine its validity.”

    • Kerlinger: “A hypothesis is a conjectural statement of the relationship between two or more variables.”

    Sources of Hypothesis

    Hypotheses do not arise randomly; they are derived from several intellectual and empirical sources:

    1. Theory

    Existing economic theories are the most important source of hypotheses.
    Example: Keynesian theory suggests a relationship between income and consumption, leading to hypotheses about marginal propensity to consume.

    2. Previous Studies and Literature

    Past research findings often suggest gaps, contradictions, or extensions that give rise to new hypotheses.

    3. Observation and Experience

    Real-world observations—such as regional inequality, unemployment trends, or inflation behavior—may lead researchers to frame hypotheses.

    4. Analogies

    Similarities between phenomena can suggest hypotheses.
    Example: Concepts from industrial organization applied to digital platforms.

    5. Intuition and Insight

    Researchers’ intellectual insight or creative thinking can generate hypotheses, though these must still be empirically tested.

    6. Social and Policy Problems

    Contemporary economic problems (poverty, inflation, unemployment, climate change) provide fertile ground for hypothesis formulation.

    Steps Involved in Testing a Hypothesis

    Hypothesis testing is a systematic statistical procedure used to decide whether empirical evidence supports or rejects a hypothesis.

    Step 1: Formulation of Hypothesis

    Two hypotheses are formulated:

    • Null Hypothesis (H₀):
      States that there is no relationship or no effect.
      Example:

      H₀: Education has no effect on wages.

    • Alternative Hypothesis (H₁ or Hₐ):
      States that a relationship or effect does exist.
      Example:

      H₁: Education positively affects wages.

    Step 2: Selection of Significance Level (α)

    The significance level represents the probability of rejecting a true null hypothesis.

    Common levels:

    • 1% (0.01)

    • 5% (0.05)

    • 10% (0.10)

    Economics research commonly uses 5%.

    Step 3: Choice of Appropriate Test Statistic

    Depending on the nature of data and sample size, an appropriate test is selected, such as:

    • Z-test

    • t-test

    • Chi-square test

    • F-test

    Step 4: Specification of Sampling Distribution

    The probability distribution of the test statistic under the null hypothesis is identified (normal, t, chi-square, etc.).

    Step 5: Computation of Test Statistic

    Using sample data, the value of the test statistic is calculated.

    Step 6: Determination of Critical Value / p-value

    • Critical value approach: Compare calculated value with table value.

    • p-value approach: Compare p-value with significance level.

    Step 7: Decision Rule

    • If the calculated test statistic exceeds the critical value (in absolute value, for two-tailed tests) → Reject H₀

    • If the calculated value does not exceed the critical value → Fail to reject H₀

    Equivalently, under the p-value approach, H₀ is rejected when the p-value is smaller than the chosen significance level α.

    Step 8: Conclusion and Interpretation

    The result is interpreted in the context of the research problem, policy relevance, or economic theory.

    Example:

    “The null hypothesis is rejected at the 5% level, indicating that education significantly affects wages.”
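    These steps can be illustrated with a small simulated (hypothetical) education–wage data set and an OLS regression in statsmodels; the coefficient values and sample size below are assumptions made purely for the sketch.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(42)
    n = 500

    # Hypothetical data: years of education and log hourly wages
    educ = rng.integers(5, 18, size=n).astype(float)
    log_wage = 1.5 + 0.08 * educ + rng.normal(scale=0.4, size=n)

    X = sm.add_constant(educ)                 # regressors: intercept + education
    model = sm.OLS(log_wage, X).fit()

    t_stat = model.tvalues[1]                 # test statistic for H0: beta_educ = 0
    p_val = model.pvalues[1]

    alpha = 0.05                              # 5% significance level
    decision = "Reject H0" if p_val < alpha else "Fail to reject H0"
    print(f"t = {t_stat:.2f}, p-value = {p_val:.4f} -> {decision} at the 5% level")
    ```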

    Importance of Hypothesis in Research

    • Provides direction to research

    • Helps in theory testing

    • Ensures objectivity

    • Facilitates statistical analysis

    • Links theory with empirical evidence

      Semi-log and Log-linear Regression Models

    1. Introduction

    In econometric analysis, logarithmic transformations are widely used to:

    • linearise non-linear relationships

    • stabilise variance

    • interpret coefficients meaningfully

    Two commonly used models are the semi-log model and the log-linear model. Though both involve logarithms, they differ in structure and interpretation.

    2. Semi-log Model

    Definition

    In a semi-log model, only one variable (either dependent or independent) is expressed in logarithmic form.

    Types and Functional Form

    (a) Log–lin form (dependent variable in log form)

    $$\ln Y = \alpha + \beta X + u$$

    (b) Lin–log form (independent variable in log form)

    $$Y = \alpha + \beta \ln X + u$$

    Interpretation

    • In the log–lin form:
      100·β gives the approximate percentage change in Y for a one-unit change in X (a semi-elasticity).

    • In the lin–log form:
      β/100 gives the approximate absolute change in Y for a 1% change in X.

    Applications

    • Wage determination models

    • Engel curve estimation

    • Growth analysis with dummy variables

    • Impact of education on earnings

    3. Log-linear (Double-log) Model

    Definition

    In a log-linear (double-log) model, both dependent and independent variables are expressed in logarithmic form.

    Functional Form

    $$\ln Y = \alpha + \beta \ln X + u$$

    Interpretation

    • β is an elasticity

    • It measures the percentage change in Y for a 1% change in X

    Applications

    • Demand and supply analysis

    • Production function estimation (Cobb–Douglas)

    • Export–import demand studies

    • Price elasticity estimation

    4. Distinction between Semi-log and Log-linear Models

    | Basis | Semi-log Model | Log-linear (Double-log) Model |
    | --- | --- | --- |
    | Variables in log form | One variable | Both variables |
    | Functional form | Mixed (log–lin or lin–log) | Fully logarithmic |
    | Coefficient interpretation | Semi-elasticity | Elasticity |
    | Complexity | Relatively simple | More analytically rich |
    | Common use | Growth, wage, policy impact | Demand, production, trade |
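    The contrast in interpretation can be seen in a short statsmodels sketch on simulated demand data; the data-generating values (a price elasticity of about -1.2) are hypothetical.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 300

    # Hypothetical demand data: quantity generated with a constant price elasticity of -1.2
    price = rng.uniform(5, 50, size=n)
    quantity = np.exp(4.0 - 1.2 * np.log(price) + rng.normal(scale=0.2, size=n))

    # Double-log (log-linear) model: both variables in logs, beta is an elasticity
    double_log = sm.OLS(np.log(quantity), sm.add_constant(np.log(price))).fit()
    print("Price elasticity estimate:", round(double_log.params[1], 3))   # close to -1.2

    # Semi-log (log-lin) model: only the dependent variable in logs;
    # 100*beta approximates the % change in quantity per one-unit change in price
    semi_log = sm.OLS(np.log(quantity), sm.add_constant(price)).fit()
    print("Semi-elasticity (% per unit of price):", round(100 * semi_log.params[1], 2))
    ```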

    5. Advantages of Logarithmic Models (Brief)

    • Reduces heteroscedasticity

    • Handles non-linearity

    • Improves normality of residuals

    • Facilitates economic interpretation

    Stratified random sampling & Purposive sampling

    Data collection methods determine how units are selected from a population for research. Sampling methods are broadly classified into probability sampling and non-probability sampling.
    Stratified random sampling belongs to probability sampling, while purposive sampling is a non-probability sampling method.

    (a) Stratified Random Sampling

    Meaning

    Stratified random sampling is a probability sampling technique in which the population is first divided into homogeneous sub-groups called strata, and then random samples are drawn from each stratum.

    The key idea is to ensure that all important sub-groups are adequately represented in the sample.

    Steps Involved

    1. Identify the target population

    2. Divide the population into mutually exclusive and exhaustive strata based on relevant characteristics (e.g., income, region, gender)

    3. Select samples randomly from each stratum

    4. Combine samples from all strata to form the final sample

    Types of Stratified Sampling

    • Proportionate stratified sampling: Sample size from each stratum is proportional to its population size

    • Disproportionate stratified sampling: Sample sizes differ intentionally to study smaller strata in detail

    Example (Economics)

    Suppose a researcher studies income inequality among households in India.

    • Population is divided into strata based on income groups:

      • Low income

      • Middle income

      • High income

    • Random samples are selected from each income group.

    This ensures that all income categories are represented, avoiding bias toward any one group.
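    A minimal pandas sketch of proportionate stratified sampling, assuming a hypothetical sampling frame with an income-group column, is shown below; the frame size, stratum shares, and 10% sampling fraction are illustrative assumptions.

    ```python
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(7)

    # Hypothetical household frame with income strata
    frame = pd.DataFrame({
        "household_id": range(1000),
        "income_group": rng.choice(["low", "middle", "high"], size=1000, p=[0.5, 0.35, 0.15]),
    })

    # Proportionate stratified sampling: draw 10% at random within each stratum
    sample = (frame.groupby("income_group", group_keys=False)
                   .sample(frac=0.10, random_state=7))

    # Each stratum appears in the sample roughly in proportion to its population share
    print(sample["income_group"].value_counts())
    ```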

    Merits

    • Ensures better representation

    • Reduces sampling error

    • Suitable for heterogeneous populations

    • Improves precision of estimates

    Limitations

    • Requires prior information about population

    • More complex and time-consuming

    • Incorrect stratification can distort results

    (b) Purposive Sampling

    Meaning

    Purposive sampling (also called judgmental sampling) is a non-probability sampling method in which the researcher selects units deliberately based on their relevance to the research objective.

    Selection depends on the researcher’s judgment and expertise, not randomization.

    Key Characteristics

    • No random selection

    • Focus on information-rich cases

    • Common in qualitative and exploratory research

    • Used when the population is small or specialized

    Types of Purposive Sampling

    • Expert sampling – selecting experts in a field

    • Typical case sampling – selecting representative cases

    • Critical case sampling – selecting crucial or extreme cases

    Example (Economics)

    Suppose a study examines policy challenges faced by Self-Help Groups (SHGs).

    The researcher deliberately selects:

    • SHG leaders

    • NGO coordinators

    • Local development officers

    These respondents are chosen because they possess specialized knowledge, not because they represent the population statistically.

    Merits

    • Useful when probability sampling is not feasible

    • Cost-effective and time-saving

    • Ideal for in-depth qualitative studies

    • Focuses on relevant respondents

    Limitations

    • High risk of researcher bias

    • Findings cannot be generalized

    • Lack of statistical representativeness

    Comparison between Stratified Random and Purposive Sampling

    | Basis | Stratified Random Sampling | Purposive Sampling |
    | --- | --- | --- |
    | Type | Probability sampling | Non-probability sampling |
    | Selection | Random | Researcher’s judgment |
    | Representativeness | High | Limited |
    | Bias | Low | High |
    | Statistical inference | Possible | Not reliable |
    | Usage | Quantitative studies | Qualitative / exploratory studies |

    Significance of Normal Distribution Assumption in Regression Analysis

    In classical linear regression analysis, one of the important assumptions is that the error term (disturbance term) follows a normal distribution with mean zero and constant variance. This assumption plays a crucial role in statistical inference, though not in the estimation of regression coefficients themselves.

    Meaning of the Assumption

    The normality assumption states that:

    $$u_i \sim N(0, \sigma^2)$$

    where

    • u_i is the error term,

    • the mean is zero, and

    • variance is constant.

    This implies that most errors cluster around zero, with extreme errors being rare.

    Significance of Normality Assumption

    1. Validity of Hypothesis Testing

    Normality ensures that:

    • t-tests for individual regression coefficients

    • F-tests for overall model significance

    are statistically valid, especially in small samples.

    2. Construction of Confidence Intervals

    Confidence intervals for regression coefficients rely on the assumption that estimators follow a normal or t-distribution, which is guaranteed when errors are normally distributed.

    3. Exact Sampling Distributions

    With normal errors, estimators such as OLS coefficients have exact sampling distributions, not just approximate ones. This improves reliability of inference.

    4. Importance in Small Samples

    In large samples, the Central Limit Theorem often compensates for non-normality. However, in small samples, normality becomes crucial for correct inference.

    5. Efficiency of Estimators

    Under normality, OLS estimators are not only Best Linear Unbiased Estimators (BLUE) but also maximum likelihood estimators, giving them desirable optimal properties.

    6. Prediction and Forecasting

    Normality of errors allows probabilistic statements about forecast errors, improving prediction accuracy and interpretation.

    What Normality Does Not Affect

    • It does not affect unbiasedness or consistency of OLS estimators.

    • Regression coefficients can still be estimated without normality.

    • Normality mainly affects inference, not estimation.
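    In applied work, the normality assumption is often checked on the residuals, for example with the Jarque–Bera test. The sketch below uses simulated (hypothetical) data via statsmodels and scipy; it illustrates the check, not a prescribed procedure.

    ```python
    import numpy as np
    import statsmodels.api as sm
    from scipy import stats

    rng = np.random.default_rng(3)
    n = 100

    # Hypothetical small-sample regression
    x = rng.uniform(0, 10, size=n)
    y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=n)

    res = sm.OLS(y, sm.add_constant(x)).fit()

    # Jarque-Bera test of normality applied to the OLS residuals
    jb_stat, jb_p = stats.jarque_bera(res.resid)
    print(f"Jarque-Bera = {jb_stat:.2f}, p-value = {jb_p:.3f}")
    # A large p-value means normality of the errors cannot be rejected
    ```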

    Reciprocal Form of Regression Model

    The reciprocal regression model is a non-linear functional form in which the dependent variable is related to the reciprocal (inverse) of the independent variable. It is used when the effect of an explanatory variable on the dependent variable diminishes rapidly at first and then slowly.

    Specification of the Model

    The general reciprocal regression model is written as:

    $$Y = a + \frac{b}{X} + u$$

    where:

    • Y = dependent variable

    • X = independent variable

    • a, b = parameters

    • u = random error term

    If b > 0, Y falls as X rises, so the relationship between Y and X is inverse, with Y approaching the asymptote a as X becomes very large.

    Explanation of the Reciprocal Relationship

    • As X increases, 1/X decreases.

    • Changes in X have larger effects when X is small and smaller effects when X is large.

    • This captures the diminishing influence of X on Y.

    Applicability of Reciprocal Regression in Economics

    1. Average Cost and Output

    $$AC = a + \frac{b}{Q}$$

    • As output (Q) increases, average fixed cost falls.

    • Widely used in cost theory.

    2. Productivity and Labour Input

    $$\text{Output per worker} = a + \frac{b}{L}$$

    • Small increases in labour at low levels significantly affect productivity.

    • Effect weakens as labour increases.

    3. Interest Rate and Investment Efficiency

    $$I = a + \frac{b}{r}$$

    • At very low interest rates, changes have strong effects on investment.

    • At higher rates, marginal impact declines.

    4. Time Taken and Speed

    $$T = a + \frac{b}{S}$$

    • Common in transport and logistics economics.

    • Increasing speed initially reduces time sharply; later reductions are smaller.

    5. Poverty or Inequality Measures

    • Used when improvements are rapid at low income levels but slow at higher levels.

    • Useful in development economics.

    Illustration (Simple Example)

    Suppose we model average cost (AC) as:

    $$AC = 50 + \frac{200}{Q}$$

    • When Q = 10, AC = 70

    • When Q = 100, AC = 52

    This shows rapid decline initially and slow decline later, which linear models cannot capture well.
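    The same relationship can be estimated by ordinary least squares after transforming the regressor to 1/Q. The sketch below uses simulated (hypothetical) cost data generated from the illustration above.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)

    # Hypothetical cost data generated from AC = 50 + 200/Q plus noise
    Q = rng.uniform(5, 150, size=200)
    AC = 50 + 200 / Q + rng.normal(scale=1.0, size=200)

    # Reciprocal model estimated by OLS: regress AC on a constant and 1/Q
    X = sm.add_constant(1.0 / Q)
    fit = sm.OLS(AC, X).fit()

    a_hat, b_hat = fit.params
    print(f"Estimated AC = {a_hat:.1f} + {b_hat:.1f}/Q")   # estimates close to 50 and 200
    ```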


    Merits of the Reciprocal Model

    • Captures non-linear diminishing effects

    • Economically intuitive in many real-world situations

    • Often provides better fit than linear models

    Limitations

    • Cannot be used when X = 0

    • Interpretation of coefficients is less direct than in linear models

    • Sensitive to measurement errors in small values of X

      Distinguish

    (i) Oral Histories and Life History

    | Basis | Oral Histories | Life History |
    | --- | --- | --- |
    | Meaning | Collection of oral accounts of past events | Detailed narrative of an individual’s entire life |
    | Focus | Specific events or periods | Whole life experiences |
    | Scope | Narrow and event-centred | Broad and comprehensive |
    | Time coverage | Limited time span | Long-term (childhood to present) |
    | Use | Historical and social research | Sociological and developmental studies |

    (ii) Participant Observation and Non-Participant Observation

    | Basis | Participant Observation | Non-Participant Observation |
    | --- | --- | --- |
    | Role of researcher | Actively participates in the group | Remains detached and passive |
    | Interaction | High interaction with subjects | No or minimal interaction |
    | Depth of data | Rich and in-depth | Limited to observable behaviour |
    | Objectivity | Risk of involvement bias | More objective |
    | Suitability | Cultural and community studies | Behavioural and institutional studies |

    (iii) Pooled Data and Panel Data

    | Basis | Pooled Data | Panel Data |
    | --- | --- | --- |
    | Nature | Combination of cross-section data | Same units observed over time |
    | Time dimension | No individual time tracking | Has both time and individual dimensions |
    | Unit identity | Not preserved | Preserved |
    | Analysis | Simpler regression | Advanced econometric techniques |
    | Example | Different households surveyed once | Same households surveyed yearly |
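    A minimal pandas sketch, with hypothetical households and consumption values, shows how the two data structures differ in practice: panel data preserve unit identity across time, while pooled cross-sections draw fresh units each round.

    ```python
    import pandas as pd

    # Hypothetical panel: the same two households observed in two years
    panel = pd.DataFrame({
        "household": ["A", "A", "B", "B"],
        "year": [2022, 2023, 2022, 2023],
        "consumption": [100, 110, 80, 95],
    }).set_index(["household", "year"])    # unit identity and time both preserved

    # Hypothetical pooled cross-sections: different households each year, no unit tracking
    pooled = pd.DataFrame({
        "year": [2022, 2022, 2023, 2023],
        "household_id": [101, 102, 205, 206],   # fresh sample each round
        "consumption": [100, 80, 110, 95],
    })

    print(panel)
    print(pooled)
    ```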

    (iv) Questionnaire and Schedule

    | Basis | Questionnaire | Schedule |
    | --- | --- | --- |
    | Meaning | A list of questions filled in by the respondent | A list of questions filled in by the enumerator |
    | Mode | Self-administered | Interview-based |
    | Literacy requirement | Requires literate respondents | Suitable for illiterate respondents |
    | Cost | Low cost | Relatively expensive |
    | Response rate | Usually low | High |
    | Bias | Less interviewer bias | Possibility of interviewer bias |
    | Suitability | Large, educated populations | Rural areas and field surveys |

    (v) Population Census and Economic Census

    | Basis | Population Census | Economic Census |
    | --- | --- | --- |
    | Coverage | Counts people | Counts economic units/enterprises |
    | Objective | Demographic and social information | Structure of economic activities |
    | Conducted by | Registrar General of India | Ministry of Statistics & Programme Implementation |
    | Information collected | Age, sex, literacy, occupation | Type of enterprise, employment, location |
    | Frequency | Once every 10 years | Periodic (not fixed like the population census) |
    | Use | Planning social services | Economic planning and industrial policy |

    (vi) Primary Data and Secondary Data

    Primary Data

    • Collected first-hand by the researcher

    • Specific to the research objective

    • Costly and time-consuming

    Secondary Data

    • Already collected by others

    • General-purpose data

    • Economical and time-saving

    (vii) Population Census and Economic Census

    Population Census

    • Complete enumeration of people

    • Focuses on demographic and social characteristics

    • Unit of study: individual/household

    Economic Census

    • Complete enumeration of economic establishments

    • Focuses on economic activities and employment

    • Unit of study: enterprise/establishment

    (viii) Research Techniques and Research Tools

    Research Techniques

    • Procedures or methods used to conduct research

    • Indicate how data are collected or analysed

    • Examples: survey, interview, regression analysis

    Research Tools

    • Instruments used to implement techniques

    • Help in data collection or measurement

    • Examples: questionnaire, interview schedule, scale

    (ix) Measured Variable and Latent Variable

    Measured Variable

    • Directly observable and measurable

    • Data obtained through instruments

    • Example: income, years of education

    Latent Variable

    • Not directly observable

    • Inferred from measured variables

    • Example: poverty, intelligence, job satisfaction

    (x) Inclusion and Exclusion of Variables and Linkage with Adjusted R²

    Inclusion of Variables

    • Adds explanatory factors to the model

    • May increase goodness of fit

    Exclusion of Variables

    • Removes irrelevant or insignificant variables

    • Helps avoid overfitting

    Link with Adjusted R²

    • R² never decreases when variables are added, even if they are irrelevant

    • Adjusted R² increases only if the new variable improves the model after adjusting for degrees of freedom

    • Used as a criterion for variable selection
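    For reference, with n observations and k regressors, adjusted R² is

    $$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}$$

    so it rises only when an added variable improves R² by more than the penalty for the lost degree of freedom.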

    (xi) Life History vs Narratives

    | Basis | Life History | Narratives |
    | --- | --- | --- |
    | Meaning | Detailed account of an individual’s entire life | Story or account of specific events or experiences |
    | Scope | Long-term, comprehensive | Shorter, focused |
    | Nature | Chronological | Thematic or episodic |
    | Usage | Sociology, anthropology | Qualitative social research |
    | Depth | Very deep | Relatively limited |

    (xii) Sampling Errors vs Non-sampling Errors

    | Basis | Sampling Errors | Non-sampling Errors |
    | --- | --- | --- |
    | Meaning | Errors due to using a sample instead of the population | Errors arising from data collection and processing |
    | Occurrence | Only in sample surveys | In both census and sample surveys |
    | Cause | Random variation | Bias, non-response, measurement errors |
    | Control | Reduced by larger samples | Reduced by better survey design |
    | Nature | Quantifiable | Often non-quantifiable |

    (xiii) Instrumentalism vs Realism

    | Basis | Instrumentalism | Realism |
    | --- | --- | --- |
    | View of theories | Tools for prediction | True description of reality |
    | Assumptions | Need not be realistic | Must reflect real-world mechanisms |
    | Focus | Predictive accuracy | Explanatory power |
    | Associated with | Milton Friedman | Critical realism |
    | Criticism | Ignores reality | Difficult to verify empirically |

    (xiv) Positive vs Normative Measures of Inequality

    | Basis | Positive Measures | Normative Measures |
    | --- | --- | --- |
    | Nature | Descriptive | Value-based |
    | Objective | Measure inequality as it exists | Judge inequality against a standard |
    | Examples | Gini coefficient, Lorenz curve | Atkinson index |
    | Value judgment | Absent | Present |
    | Usage | Empirical analysis | Welfare evaluation |

    (xv) Research Design vs Research Methods

    | Basis | Research Design | Research Methods |
    | --- | --- | --- |
    | Meaning | Overall plan of the research | Techniques used to collect and analyse data |
    | Nature | Conceptual framework | Operational tools |
    | Scope | Broad | Narrow |
    | Examples | Exploratory, descriptive design | Survey, interview, regression |
    | Stage | Before data collection | During data collection and analysis |

    (xvi) Time-series Data vs Pooled Data

    | Basis | Time-series Data | Pooled Data |
    | --- | --- | --- |
    | Meaning | Observations on a single unit over time | Combination of cross-section and time-series data |
    | Units | Same unit repeatedly | Multiple units over time |
    | Time dimension | Essential | Essential |
    | Example | India’s GDP from 2010–2024 | GDP of several states from 2010–2024 |
    | Use | Trend analysis and forecasting | Heterogeneity and richer inference |

    (xvii) Action Research vs Exploratory Research

    | Basis | Action Research | Exploratory Research |
    | --- | --- | --- |
    | Purpose | Solving a practical problem | Gaining initial understanding |
    | Nature | Intervention-oriented | Discovery-oriented |
    | Outcome | Immediate action and change | Hypothesis formulation |
    | Researcher role | Active participant | Detached investigator |
    | Example | Improving school attendance | Studying causes of low attendance |

    (xviii) Induction vs Deduction Approach of Enquiry

    | Basis | Induction | Deduction |
    | --- | --- | --- |
    | Direction | From specific to general | From general to specific |
    | Logic | Observation → theory | Theory → hypothesis → test |
    | Nature | Theory-building | Theory-testing |
    | Used in | Mostly qualitative research | Mostly quantitative research |
    | Example | Observing markets to build a theory | Testing demand theory with data |

    (xix) Nominal vs Ordinal Scaling Techniques

    | Basis | Nominal Scale | Ordinal Scale |
    | --- | --- | --- |
    | Nature | Classification only | Classification with ranking |
    | Order | No order | Order exists |
    | Arithmetic operations | Not possible | Limited (differences not meaningful) |
    | Example | Gender, religion | Income groups (low, middle, high) |
    | Central tendency | Mode | Median, mode |

    (xx) Ontology vs Epistemology

    | Basis | Ontology | Epistemology |
    | --- | --- | --- |
    | Concern | Nature of reality | Nature of knowledge |
    | Key question | What exists? | How do we know? |
    | Focus | Being and existence | Knowledge and justification |
    | Research link | What reality is being studied | How that reality can be studied |
    | Example | Is poverty objective or subjective? | Can poverty be measured statistically? |