Statistical Modeling with Statsmodels
panel_data_analysis.py
"""
Panel Data Analysis

Author: RSK World
Website: https://rskworld.in
Email: help@rskworld.in
Phone: +91 93305 39277
"""

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
from statsmodels.regression.linear_model import OLS
from statsmodels.tools.tools import add_constant
import warnings
warnings.filterwarnings('ignore')


class PanelDataAnalysis:
    """
    Panel Data Analysis Tools
    
    Author: RSK World
    Website: https://rskworld.in
    Email: help@rskworld.in
    Phone: +91 93305 39277
    """
    
    def __init__(self):
        self.data = None
        self.entity_col = None
        self.time_col = None
    
    def prepare_panel_data(self, data, entity_col, time_col, value_cols):
        """
        Prepare data for panel analysis
        
        Parameters:
        -----------
        data : DataFrame
            Panel data
        entity_col : str
            Entity identifier column
        time_col : str
            Time identifier column
        value_cols : list
            Value columns to analyze (currently informational; all columns are kept)
        """
        self.data = data.copy()
        self.entity_col = entity_col
        self.time_col = time_col
        
        # Set multi-index
        if not isinstance(data.index, pd.MultiIndex):
            self.data = self.data.set_index([entity_col, time_col])
        
        return self.data
    
    def fixed_effects_regression(self, y_col, X_cols):
        """
        Fixed Effects Panel Regression
        
        Parameters:
        -----------
        y_col : str
            Dependent variable column
        X_cols : list
            Independent variable columns
        """
        if self.data is None:
            raise ValueError("Data not prepared. Call prepare_panel_data() first.")
        
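        # Create entity dummies. One level is dropped to avoid perfect
        # collinearity with the constant, and float dtype keeps the design
        # matrix numeric (newer pandas versions return bool dummies by default).
        entity_ids = self.data.index.get_level_values(0)
        entity_dummies = pd.get_dummies(entity_ids, prefix='entity',
                                        drop_first=True, dtype=float)
        entity_dummies.index = self.data.index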
        
        # Combine features
        X = pd.concat([self.data[X_cols], entity_dummies], axis=1)
        y = self.data[y_col]
        
        # Fit model
        model = OLS(y, add_constant(X)).fit()
        
        print("Fixed Effects Panel Regression:")
        print("=" * 70)
        print(model.summary())
        
        return model
    
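    def random_effects_regression(self, y_col, X_cols):
        """
        Random Effects Panel Regression (feasible GLS via quasi-demeaning)
        
        Variance components are estimated from the within and between
        regressions (Swamy-Arora style); theta = 0 reduces to pooled OLS.
        
        Parameters:
        -----------
        y_col : str
            Dependent variable column
        X_cols : list
            Independent variable columns
        """
        if self.data is None:
            raise ValueError("Data not prepared. Call prepare_panel_data() first.")
        
        cols = X_cols + [y_col]
        grouped = self.data[cols].groupby(level=0)
        entity_means = grouped.mean()
        means = grouped.transform('mean')        # entity mean repeated per row
        T_i = grouped[y_col].transform('count')  # observations per entity
        n_obs, n_ent, k = len(self.data), len(entity_means), len(X_cols)
        
        # Within (fully demeaned) regression gives the idiosyncratic error variance
        within = self.data[cols] - means
        sigma_e2 = OLS(within[y_col], within[X_cols]).fit().ssr / (n_obs - n_ent - k)
        
        # Between regression on entity means gives the entity-effect variance
        between = OLS(entity_means[y_col], add_constant(entity_means[X_cols])).fit()
        T_bar = n_ent / (1.0 / grouped[y_col].count()).sum()  # harmonic mean of T_i
        sigma_u2 = max(between.ssr / (n_ent - k - 1) - sigma_e2 / T_bar, 0.0)
        
        # Quasi-demean: subtract theta * entity mean from every variable
        theta = 1 - np.sqrt(sigma_e2 / (sigma_e2 + T_i * sigma_u2))
        X = self.data[X_cols].sub(means[X_cols].mul(theta, axis=0))
        X.insert(0, 'const', 1 - theta)
        y = self.data[y_col] - theta * means[y_col]
        
        # Fit model on quasi-demeaned data (the constant is already transformed)
        model = OLS(y, X).fit()
        
        print("Random Effects Panel Regression:")
        print("=" * 70)
        print(model.summary())
        
        return model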
    
    def hausman_test(self, y_col, X_cols):
        """
        Hausman Test for Fixed vs Random Effects
        
        Parameters:
        -----------
        y_col : str
            Dependent variable column
        X_cols : list
            Independent variable columns
        """
        # Fixed effects
        fe_model = self.fixed_effects_regression(y_col, X_cols)
        
        # Random effects
        re_model = self.random_effects_regression(y_col, X_cols)
        
        # Extract coefficients
        fe_coef = fe_model.params[X_cols]
        re_coef = re_model.params[X_cols]
        
        # Calculate test statistic
        diff = fe_coef - re_coef
        fe_cov = fe_model.cov_params().loc[X_cols, X_cols]
        re_cov = re_model.cov_params().loc[X_cols, X_cols]
        cov_diff = fe_cov - re_cov
        
        try:
            test_stat = diff.T @ np.linalg.inv(cov_diff) @ diff
            df = len(X_cols)
            p_value = 1 - stats.chi2.cdf(test_stat, df)
            
            print("\nHausman Test:")
            print("=" * 70)
            print(f"Test Statistic: {test_stat:.4f}")
            print(f"Degrees of Freedom: {df}")
            print(f"p-value: {p_value:.4f}")
            
            if p_value < 0.05:
                print("Result: Reject null hypothesis - Use Fixed Effects")
            else:
                print("Result: Fail to reject null hypothesis - Use Random Effects")
            
            return {'test_statistic': test_stat, 'p_value': p_value, 'df': df}
        except np.linalg.LinAlgError:
            print("Hausman test could not be computed (singular matrix)")
            return None
    
    def plot_panel_data(self, y_col, entity_subset=None):
        """Plot panel data over time"""
        if self.data is None:
            raise ValueError("Data not prepared. Call prepare_panel_data() first.")
        
        if entity_subset is None:
            entities = self.data.index.get_level_values(0).unique()[:10]  # First 10
        else:
            entities = entity_subset
        
        plt.figure(figsize=(12, 6))
        
        for entity in entities:
            entity_data = self.data.loc[entity, y_col]
            if isinstance(entity_data, pd.Series):
                plt.plot(entity_data.index, entity_data.values, 
                        label=f'Entity {entity}', alpha=0.7)
        
        plt.xlabel('Time')
        plt.ylabel(y_col)
        plt.title('Panel Data Over Time')
        plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
        plt.grid(True, alpha=0.3)
        plt.tight_layout()
        plt.show()


if __name__ == "__main__":
    # Example usage
    print("Panel Data Analysis Example")
    print("=" * 70)
    
    # Generate sample panel data
    np.random.seed(42)
    n_entities = 5
    n_periods = 10
    
    data_list = []
    for entity in range(n_entities):
        for period in range(n_periods):
            # Draw the regressors first so that y can depend on them
            x1 = np.random.randn()
            x2 = np.random.randn()
            data_list.append({
                'entity': entity,
                'time': period,
                'X1': x1,
                'X2': x2,
                'y': 2 + 1.5 * x1 + 0.8 * x2 + np.random.randn() * 0.5
            })
    
    df = pd.DataFrame(data_list)
    
    # Create panel analysis
    panel = PanelDataAnalysis()
    panel.prepare_panel_data(df, 'entity', 'time', ['X1', 'X2', 'y'])
    
    # Fixed effects
    fe_model = panel.fixed_effects_regression('y', ['X1', 'X2'])
    
    # Random effects
    re_model = panel.random_effects_regression('y', ['X1', 'X2'])
    
    # Hausman test
    panel.hausman_test('y', ['X1', 'X2'])
