Unbalanced panel data?
Hi, I am doing research based on unbalanced panel data covering nine years and six dependent variables. The data has the following issues:
- For one country, two years of data are missing for three variables.
- For another country, there is no data for one variable.
- For two other countries, there is no data for one variable.
How should this be tackled?
An imbalanced data set is simply a skewed data set, one in which some classes have far more observations than others. The problem is that a statistical technique designed for balanced data can lead to wrong conclusions when applied to it. There is no single best approach or model for imbalanced data, but here are some techniques for handling it:
- Use the right evaluation metrics
- Resample the training set
- Use k-fold cross-validation in the right way
- Ensemble different resampled datasets
- Resample with different ratios
- Cluster the abundant class
- Design your models
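Two of the techniques in the list, resampling and choosing a suitable evaluation metric, can be sketched in plain Python. This is only a minimal illustration, assuming a classification setting with a label column. The toy data and the helper names `undersample` and `balanced_accuracy` are made up for the example and do not come from any particular library.

```python
import random
from collections import Counter


def undersample(rows, label_key, seed=0):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n_min))
    rng.shuffle(balanced)
    return balanced


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical toy data: 90 rows of class 0 and 10 rows of class 1.
rows = [{"x": i, "label": 0} for i in range(90)]
rows += [{"x": i, "label": 1} for i in range(10)]

# Resampling: after undersampling, both classes have 10 rows.
balanced = undersample(rows, "label")
class_counts = Counter(r["label"] for r in balanced)

# Evaluation metric: a model that always predicts the abundant class
# looks good on plain accuracy but not on balanced accuracy.
y_true = [r["label"] for r in rows]
y_pred = [0] * len(rows)
plain_acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
```

On this toy data the always-majority predictor reaches a plain accuracy of 0.9 but a balanced accuracy of only 0.5, which is why the choice of metric matters. Undersampling discards data, so with a small data set oversampling the rare class may be the better choice.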
You could also be creative and combine several of these approaches. Also, check whether the raw data has become obsolete. All the best!